Node Manager UX and Issues

It needs to be named winsw.exe. I think it’s case insensitive, but it won’t pick it up if the exe is called WinSW-x64.exe.

If nothing happens when you run it, that doesn’t necessarily mean it isn’t being found. Try winsw --help or winsw /? (the latter is a Windows convention) or something like that.
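In other words, the fix is just renaming the downloaded binary. On Windows (cmd.exe) the real command would be `ren WinSW-x64.exe winsw.exe`; the sketch below simulates it with a dummy file so it runs anywhere:

```shell
#!/bin/sh
# WinSW ships as e.g. WinSW-x64.exe, but it must sit alongside the
# services as plain winsw.exe. Simulated here with an empty dummy file.
tmp=$(mktemp -d)
touch "$tmp/WinSW-x64.exe"
mv "$tmp/WinSW-x64.exe" "$tmp/winsw.exe"
ls "$tmp"
```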

Hmm, it comes as WinSW though, and I did notice you were writing winsw; I wasn’t sure about case sensitivity, but I didn’t think we would be renaming it?

I have 20 minutes before I need to set off so going to give that a quick go, it’s bothering me no end.

Remember though that users won’t have to set this up manually. It will be part of safeup.

Alright it was a matter of needing to rename the binary.

But now it complains about the config not existing.

1 service(s) to be added
Downloading safenode version 0.105.2...
Download completed: C:\Users\kyte7\AppData\Local\Temp\1603e6db-785d-43ea-9884-df2c6bc1638a\safenode.exe
Failed to add 1 service(s):
✕ safenode1: 2024-04-01 02:54:09,102 FATAL - Unhandled exception
System.IO.FileNotFoundException: Unable to locate winsw.[xml|yml] file within executable directory
   at WinSW.Program.LoadConfigAndInitLoggers(Boolean inConsoleMode)
   at WinSW.Program.Run(String[] argsArray, IServiceConfig config)
   at WinSW.Program.Main(String[] args)
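For context on that error: WinSW looks for a config file named after the executable (winsw.xml or winsw.yml) in the same directory as winsw.exe. safenode-manager is expected to generate this itself, but a minimal hand-written sketch (the id and path here are hypothetical) would look something like:

```xml
<service>
  <!-- hypothetical example; safenode-manager should generate this -->
  <id>safenode1</id>
  <name>safenode1</name>
  <description>Safe Network node service</description>
  <executable>C:\ProgramData\safenode\safenode.exe</executable>
</service>
```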

I need to go work and you are off, so no biggy :slight_smile:

Right, thanks for your efforts!

Next time, can we just make sure we are trying this from a clean slate though? Because I think you’ve already added some services and stuff, so it may be in an odd state just now.

Can do, but your clean-slate instructions are for Linux. That reset command would come in real handy.

Yeah indeed! I’m working on it now. I actually have it implemented, but just want to create some integration tests too, to make sure it works on all platforms.

About to try on a clean slate, and it occurred to me that safeup is a little legacy.
It has a nice ring to it that autonomiup doesn’t really have. Hoping we can keep safeup as a little reminder of times gone by.

Edit: fresh start on a brand new machine that has never seen node-manager; exact same issue as I posted above, unable to locate winsw.xml.

auto_up? :wink:

Not bad, pretty good actually, but I still like the historical aspect of letting safeup slip through!

Could call it antup

Maybe nodeup would be better. Auto has too many connotations. Are we talking cars? Automatic up?

Yeah, node is a common thing, but it will survive even another rebrand within 20 years.

I just got my schedule for the rest of this week and it’s going to be a bit. :confused: Thursday / Friday most likely.

Autoruns from the Sysinternals suite helped me clean up.

Noting here that I needed to reboot my physical host due to some kernel patches, and as such the LXC containers that hosted the safenode services (50 of them) were all killed by the container system immediately upon the poweroff request inside the LXC.

Upon restart of the physical host and power-on of the LXC, all the safenode pids started up immediately (due to the services being set at a runlevel of default). This caused the LXC to be unreachable due to 100% CPU usage… I finally got ssh access and basically did a killall safenode.

Then I tried to run a script that resets (intentionally) the safenode setup:

  • attempts to use safenode service to stop safenode (1 through 50 nodes)
  • attempts to then remove the log and data directory per safenode service
  • attempts to clean up the safenode-manager directories itself
  • then re-adds the safenodes 1 service at a time (1 through 50 nodes)
  • then leaves it up to the user to decide when to start them…

This allows me, for now, to use a wrapper script that starts the services staggered in a way that doesn’t make my CPU go to 100% (based on CPU load, as opposed to a constant --interval).
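A load-based stagger like that can be sketched roughly as follows. This is a minimal sketch, not the actual wrapper: the threshold is arbitrary, and at the time of writing --service-name only applied to add/remove, so a per-service start flag is an assumption here.

```shell
#!/bin/sh
# Start nodes one at a time, waiting until the 1-minute load average
# drops below a threshold before starting the next one.

MAX_LOAD=4   # tune to your core count

# succeed when the 1-minute load average is below the given threshold
load_below() {
    awk -v max="$1" '{ exit !($1 < max) }' /proc/loadavg
}

# only attempt the loop if safenode-manager is actually installed
if command -v safenode-manager >/dev/null 2>&1; then
    for i in $(seq 1 50); do
        while ! load_below "$MAX_LOAD"; do
            sleep 10
        done
        # hypothetical per-service start flag; see note above
        safenode-manager start --service-name "safenode$i"
    done
fi
```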

The issue that I faced is that, with safenode-manager and many safenode services installed, the end result is the service may not want to be set to automatic startup on boot… maybe on failure it restarts, but here all 50 safenode pids attempt to start up and eat up the entire CPU, giving an unresponsive experience to the end user while he or she waits for the CPU load to decrease (which could take minutes and minutes). Others could also see this as a big concern, especially those whose main physical host is their primary desktop with a GUI experience, running 1 or more safenodes.

For now, in case others were wondering, on Alpine I did the following in another wrapper script, after adding each safenode service via safenode-manager after my reset script:

The following commands were run

install -d /etc/runlevels/safe
cd /etc/runlevels/default/
for service in safenode*; do rc-update add $service safe; done
for service in safenode*; do rc-update del $service ; done
rc-update show safe

Since this new ‘safe’ runlevel isn’t part of the boot sequence, the services no longer start up upon reboot.

I can still use safenode-manager start/status/stop --servicename $serviceName command against them.

Note: this breaks the remove --service-name $serviceName command in safenode-manager, unless you re-migrate the service back to default prior to attempting that command.
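That re-migration can be sketched as a dry-run helper that just prints the commands to run (the service name is hypothetical):

```shell
#!/bin/sh
# Dry run: print the OpenRC commands needed to move a service back to the
# default runlevel so that `safenode-manager remove` works again.
restore_to_default() {
    echo "rc-update add $1 default"
    echo "rc-update del $1 safe"
    echo "safenode-manager remove --service-name $1"
}

restore_to_default safenode1
```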

Hopefully, the above information might be useful to those who want to use Alpine (advanced setups or advanced users).

Possible takeaway: maybe it would be good to pass in a flag to dictate automatic startup or not when adding services, and let safenode-manager handle it properly (across systemd, OpenRC, and Windows accordingly?).

Note: I don’t know if the service is set to auto-start on Windows and on other Linux distros where it uses systemd as opposed to OpenRC. Maybe the same concern applies there? Or maybe it doesn’t, because it’s set to ‘Manual’ mode on Windows, and not enabled as part of the systemctl enable command by default? :man_shrugging:

I hadn’t thought of the restart scenario, so I just restarted my LXC with 50 nodes, and it did max out the system, but only for about 10 seconds. I was able to ssh in within 30 seconds of seeing the container had started, and after running top to check stats real quick, ‘safenode-manager status’ showed all nodes running with 40+ peers already.

Ubuntu 22
10 cores
10 gigs ram

None of this is to say it shouldn’t be looked into or accounted for (it’s been discussed at length that shocking the network with nodes in the early stages can be no good); just reporting my findings from another OS, also in an LXC, running 50 nodes on restart.

What version of safenode-manager are you using?

mav@safe-99:~$ safenode-manager status --servicename safenode50
error: unexpected argument '--servicename' found

Usage: safenode-manager status [OPTIONS]

For more information, try '--help'.

Sorry, I just consolidated the command text for 3 commands into one with --service-name. It only applies to add and remove for now. My fault. Good eye lol.

The case where it will be the hardest on the user is, say, all 50 are added, then the user walks away and reboots the PC later; the initial bootstrapping, with records being fetched, will definitely take much longer, since none of the services were previously started and the user had no say in the --interval in this scenario (controlled startup).

Advanced users can work around this stuff, but yeah, in general the restart scenario should be considered, especially if it’s one’s primary GUI interface, so as to not be unresponsive; at the same time, KISS also applies here.

I had 4 Xeon cores with 50 nodes (steady state is 25% usage with all safenodes running), but during bootstrap it uses a lot of resources (as expected). For me, getting all 50 connected is still a difficult challenge, so not maxing out the CPU and doing a controlled ramp-up is important at this stage (separate issue).

Right. Good point. A restart is one thing; a full bootstrap on startup is a different story. I’m in the process of testing / working through some other things with running nodes, but I’ll definitely add that to my list of things to test for the next round of nodes from a clean slate. (Sorry in advance, future network.)

Thanks for the report on this.

This will involve an extension to the service-manager-rs crate, but I’ve already submitted a few PRs and the guy is generally very quick to respond.

So, it would definitely be possible to use an argument that would stop an automatic start up.

Not so easy though to decide what the default policy should be here. I don’t have a strong opinion about it either way. If most users are going to be running 50+ nodes, maybe non-auto start should be the default?
