Noting here that I needed to reboot my physical host for some kernel patches, and as a result the LXC containers hosting the safenode services (50 of them) were all killed by the container system immediately upon the poweroff request inside the LXC.
Upon restart of the physical host and power-on of the LXC, all the safenode pids started up immediately (because the services were set to the default runlevel). This made the LXC unreachable due to 100% CPU usage… I finally got SSH access and basically did a killall safenode.
Then I tried to run a script that (intentionally) resets the safenode setup:
- attempts to use the safenode service to stop each safenode (nodes 1 through 50)
- attempts to then remove the log and data directory for each safenode service
- attempts to clean up the safenode-manager directories themselves
- then re-adds the safenode services one at a time (nodes 1 through 50)
- then leaves it up to the user to decide when to start them…
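The steps above could be sketched roughly as follows. Note this is a sketch only: the log/data paths, the safenode-manager directory, and the exact safenode-manager flags are my assumptions, not verified values — check them against your own install before deleting anything. Setting RUN=echo gives a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the reset flow described above. Paths and flags are
# assumptions; set RUN=echo for a dry run that prints commands instead.
RUN="${RUN:-}"
NODE_COUNT="${NODE_COUNT:-50}"
LOG_ROOT="/var/log/safenode"    # hypothetical per-service log location
DATA_ROOT="/var/safenode"       # hypothetical per-service data location

reset_safenodes() {
    for i in $(seq 1 "$NODE_COUNT"); do
        svc="safenode$i"
        # step 1: stop the service via safenode-manager
        $RUN safenode-manager stop --service-name "$svc"
        # step 2: remove that service's log and data directories
        $RUN rm -rf "$LOG_ROOT/$svc" "$DATA_ROOT/$svc"
    done
    # step 3: clean up the safenode-manager directories (hypothetical path)
    $RUN rm -rf /var/safenode-manager
    # step 4: re-add the services one at a time
    for i in $(seq 1 "$NODE_COUNT"); do
        $RUN safenode-manager add    # flags omitted; see safenode-manager add --help
    done
    # step 5: starting them is deliberately left to the user
}
```

Nothing is started automatically; when to start the services remains the user's decision, as described above.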
This allows me, for now, to use a wrapper script that starts the services staggered in a way that doesn’t push my CPU to 100% (based on CPU load, as opposed to a constant --interval).
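A minimal sketch of that load-based stagger, assuming a Linux host with /proc/loadavg: wait until the 1-minute load average drops below a threshold before starting the next service. The threshold and poll interval are assumptions to tune for your core count.

```shell
#!/bin/sh
# Sketch: start services one at a time, gated on the 1-minute load
# average instead of a fixed --interval. MAX_LOAD is an assumption.
MAX_LOAD="${MAX_LOAD:-4}"

load_below_max() {
    # integer part of the 1-minute load average from /proc/loadavg
    cur=$(cut -d' ' -f1 /proc/loadavg | cut -d. -f1)
    [ "$cur" -lt "$MAX_LOAD" ]
}

staggered_start() {
    for i in $(seq 1 50); do
        until load_below_max; do
            sleep 10    # poll until the box calms down
        done
        safenode-manager start --service-name "safenode$i"
    done
}
# run as root: staggered_start
```

This never starts the next node while the machine is still busy absorbing the previous ones, which is the whole point of the wrapper.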
The issue I faced is that with safenode-manager and many safenode services installed, you may not want the services set to automatic startup on boot… restart-on-failure is fine, but here all 50 safenode pids attempt to start at once and eat the entire CPU, giving an unresponsive experience while the user waits for the CPU load to come back down (which could take many minutes). Others could see this as a big concern too, especially those whose main physical host is their primary desktop with a GUI experience, running 1 or more safenodes.
For now, in case others on Alpine were wondering, I did the following in another wrapper script, after adding each safenode service via safenode-manager in my reset script:
The following commands were run:
install -d /etc/runlevels/safe
cd /etc/runlevels/default/
for service in safenode*; do rc-update add "$service" safe; done
for service in safenode*; do rc-update del "$service" default; done
rc-update show safe
Since nothing in the boot configuration selects this new runlevel, the services in it are prevented from starting upon reboot.
I can still run the safenode-manager start/status/stop --service-name $serviceName commands against them.
Note: this breaks the remove --service-name $serviceName command in safenode-manager, unless you migrate the service back to the default runlevel before attempting that command.
Hopefully the above information is useful to those who want to use Alpine (advanced setups or advanced users).
Possible takeaway: maybe it would be good to accept a flag dictating whether automatic startup on boot is enabled when adding services, and let safenode-manager do the needful properly (across systemd, OpenRC and Windows accordingly?).
Note: I don’t know whether the service is set to auto-start on Windows and on other Linux distros that use systemd as opposed to OpenRC. Maybe the same concern applies there? Or maybe not, because the service is set to ‘Manual’ mode on Windows, and isn’t enabled via a systemctl enable command by default?