Node Manager UX and Issues

Will there be some form of stats shown in the CLI to tell whether the nodes are running and working correctly?

No problem. Looking forward to the ER once the dependencies are met with that crate. I won’t need it right away, as for now I am limited by my ISP’s not-so-great router (which shouldn’t be acting as a router) that can only handle roughly 10 to 20 safenode PIDs, due to the sheer number of connections each safenode PID makes with other peers. Hoping to solve that problem first, and then circle back to a more graceful solution for the auto-start or no-auto-start issue.

It could be the following too, given a majority might simply do 1 node (beginners):

--count = 1 (auto-start)
--count > 1 (no auto-start)
--auto-start-mode flag passed in (then use whatever the flag is set to, auto-start or not; it is up to the user to decide at the time of setup)?
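
The rule suggested above could be sketched roughly like this. It is only an illustration of the proposal: the function name is made up, and `auto_start_mode` stands in for the suggested --auto-start-mode flag, which is not an existing safenode-manager option.

```python
from typing import Optional


def should_auto_start(count: int, auto_start_mode: Optional[bool] = None) -> bool:
    """Decide whether freshly added nodes should be started straight away.

    `auto_start_mode` models the proposed (hypothetical) --auto-start-mode flag:
    None means the flag was not passed, True/False is the user's explicit choice.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # An explicit flag always wins, whatever the count is.
    if auto_start_mode is not None:
        return auto_start_mode
    # Default: a single node starts itself, several nodes wait for the user.
    return count == 1
```

With this shape a beginner adding one node gets it started automatically, while someone adding many keeps control unless they opt in.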

I would think a staged startup if automatic. 1 per minute perhaps. (or 30 secs or …)
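
A staged startup could look something like the sketch below. The names `staged_start` and `start_service` are hypothetical; `start_service` is whatever actually starts one node (for example a wrapper around `safenode-manager start --service-name <name>`), and the delay is just the "1 per minute" idea as a parameter.

```python
import time
from typing import Callable, Iterable, List


def staged_start(
    services: Iterable[str],
    start_service: Callable[[str], None],
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start services one at a time, pausing between consecutive starts."""
    started: List[str] = []
    for index, name in enumerate(services):
        # No pause before the first service, only between starts.
        if index > 0:
            sleep(delay_seconds)
        start_service(name)
        started.append(name)
    return started
```

Passing `sleep` in makes it easy to try the logic without actually waiting.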

Around here I am in a small city on cable, with a 50 Mbit/s cable ISP at 86.00/mth CDN. The infrastructure is a bit old so it suffers from brownouts: my laptop stays up while the network goes down, but the node gets punished for bad node behavior and is shunned? Hmmm.

Imo, before the close group makes a shunning decision, the other nodes in the close group should first make a gossip inquiry via UDP to find out the true nature of that ‘down and back up node’ symptom before ruling on the node’s ability to perform. The node should then ack and report ‘I have been up’, but the network (the culprit) was down, so there would be no shunning or downgrading of my node(s) within the affected node’s ‘consensus’ close group?

n.b. As Canada descends into 2nd world status, and maybe third world, it’s only gonna get worse over here before it gets better, especially in small town and rural areas. The infrastructure is stuck at 50 Mbit/s in a lot of places…

Thoughts?

@chriso

Still intermittently getting nodes left in the ADDED state when they should be started.

safenode56         12D3KooWRu1wndZponsm3g6Gv5t57CWwtrJrkue8DDhLX8xJbQia RUNNING              20
safenode57         12D3KooWFFtxDXzXmn5eQCYPWAUJfA9tspr4JAG6Y9bDvLjC7vqH RUNNING              26
safenode58         12D3KooWDZqbVp8MJYTZrm9KxCwEYAMdo2sJPvD22hMEM3q9iSot RUNNING              30
safenode59         -                                                    ADDED               -
safenode60         -                                                    ADDED               -

Attempting to start safenode59...
✓ Started safenode59 service
  - PID: 116873
  - Bin path: /var/safenode-manager/services/safenode59/safenode
  - Data path: /var/safenode-manager/services/safenode59
  - Logs path: /var/log/safenode/safenode59
Attempting to start safenode60...
✓ Started safenode60 service
  - PID: 121439
  - Bin path: /var/safenode-manager/services/safenode60/safenode
  - Data path: /var/safenode-manager/services/safenode60
  - Logs path: /var/log/safenode/safenode60

Would it be a big hassle to add the Peer ID to the output of safenode-manager?
The script I am using to monitor resources runs the safenode-manager status
command every time it collects the resources, just to find out the Peer IDs.
It would be easier to tee the manager output into a file and read it directly as required.

Something like this:

Attempting to start safenode60...
✓ Started safenode60 service
  - Peer ID: 12D3KooWDZqbVp8MJYTZrm9KxCwEYAMdo2sJPvD22hMEM3q9iSot
  - PID: 121439
  - Bin path: /var/safenode-manager/services/safenode60/safenode
  - Data path: /var/safenode-manager/services/safenode60
  - Logs path: /var/log/safenode/safenode60
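
If the start output did include a Peer ID line as proposed above, a monitoring script could read the teed file with something like this sketch. The "Peer ID:" line is only the suggested format, not what safenode-manager prints today, and the function name is made up.

```python
import re
from typing import Dict, Optional

# "Started <name> service" opens a block; "- Peer ID: <id>" is the PROPOSED extra line.
STARTED_RE = re.compile(r"Started (\S+) service")
PEER_ID_RE = re.compile(r"^\s*-\s*Peer ID:\s*(\S+)")


def peer_ids_from_start_log(text: str) -> Dict[str, str]:
    """Map service name -> Peer ID from teed `start` output in the proposed format."""
    peer_ids: Dict[str, str] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        started = STARTED_RE.search(line)
        if started:
            # Remember which service the following detail lines belong to.
            current = started.group(1)
            continue
        peer = PEER_ID_RE.match(line)
        if peer and current is not None:
            peer_ids[current] = peer.group(1)
    return peer_ids
```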

I’ll look into this. It may seem trivial, but the reason the peer ID wasn’t included was to make the service management code uniform between the node, faucet and RPC daemon. The faucet and the daemon don’t have peer IDs.

@chriso did you see this comment? This is not a concern to me so no rush. I am only trying to run things on Windows to help, so if the fix is down the line, all good.

I just noticed that the update says Windows users need WinSW to use the node manager. In my experience thus far, simply having winsw on PATH has not worked.

Yeah I did see it, sorry. Yesterday I spent my whole day working on that alpha network that we put out yesterday. I’m hoping to try and reproduce the problem on my own VM today.

Hey @anon26713768

So I tested it on my VM and everything is working OK. I think the problem is, you were using version 2 of WinSW, but the node manager has been developed against version 3. You need to download this one:

Apologies, this was not at all obvious.

As before, save WinSW-x64.exe as winsw.exe and put it somewhere on the path. Verify that it’s available by running winsw --help in the PowerShell session. Then use the node manager.
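
For anyone scripting their setup, the "is it on the path" part can also be checked with a small sketch like the one below. The function name is made up, and it only confirms that something called winsw is found; it does not tell version 2 from version 3.

```python
import shutil
from typing import Callable, Optional


def find_winsw(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the full path of winsw if it is on PATH, otherwise None."""
    # On Windows, shutil.which also tries the PATHEXT extensions, so "winsw" finds winsw.exe.
    return which("winsw")


if __name__ == "__main__":
    location = find_winsw()
    print(location if location else "winsw was not found on PATH")
```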

I’ve attached a screenshot which shows mine running OK.

Nice, I am off to the races on the alpha network. I looked at that release and figured, nah, it is a pre-release, it won’t be that. Thanks Chris!

PS C:\Windows\system32> safenode-manager status
=================================================
                Safenode Services
=================================================
Refreshing the node registry...
Service Name       Peer ID                                              Status  Connected Peers
safenode1          12D3KooWL6fy7Vc4GY8ec4ZY8mPfQ3oAmAPzRN7dyTBRyGgBzSfa ←[32mRUNNING←[0m               2
safenode2          12D3KooWNxiPV4MMm4ZXeT9FWsf2GABhEoHWP2LSQzzhqTTVc2Pm ←[32mRUNNING←[0m              12
safenode3          12D3KooWJrPqLu6HDpKQhmZN5Ra48C3CqfujAcb7HwUjUBDgUyJG ←[32mRUNNING←[0m               7
safenode4          12D3KooWCsWa3Tf4hiUEMxSbCdqiiKdoHSWhzhko64JNLy3y9r2H ←[32mRUNNING←[0m               2
safenode5          12D3KooWPKSnJyTR3ZjXZpzGYa56UceVyjg3j3yV4QymPh4pBF8d ←[32mRUNNING←[0m             152
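
The Status column above carries raw colour codes (the ←[32m … ←[0m bits), which will trip up any script that reads this output. A rough sketch for stripping them and reading the rows, assuming the four-column layout shown here stays as it is; `parse_status` and `NodeStatus` are made-up names:

```python
import re
from typing import List, NamedTuple, Optional

# Colour sequences look like ESC[32m; some consoles print the ESC byte as "←".
ANSI_RE = re.compile(r"(?:\x1b|←)\[[0-9;]*m")


class NodeStatus(NamedTuple):
    service: str
    peer_id: Optional[str]
    status: str
    connected_peers: Optional[int]


def parse_status(text: str) -> List[NodeStatus]:
    """Parse `safenode-manager status` rows, ignoring banners and the header."""
    rows: List[NodeStatus] = []
    for line in text.splitlines():
        fields = ANSI_RE.sub("", line).split()
        # Data rows have exactly four columns and a service name like safenode12.
        if len(fields) != 4 or not re.fullmatch(r"safenode\d+", fields[0]):
            continue
        service, peer_id, status, peers = fields
        rows.append(
            NodeStatus(
                service,
                None if peer_id == "-" else peer_id,  # "-" means no Peer ID yet
                status,
                int(peers) if peers.isdigit() else None,
            )
        )
    return rows
```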

Trying to start up 10 nodes on my server that already has nodes running:

sudo /home/safea/.local/bin/safenode-manager add --count 10 --node-port 14600-14609 --peer /ip4/161.35.165.63/udp/56759/quic-v1/p2p/12D3KooWGBCKLkgwv8YWb3KQBwi9fG4GYMWhYcouKSi1TLe6qm3f
=================================================
              Add Safenode Services
=================================================
10 service(s) to be added
The safe user already exists
Retrieving latest version for safenode...
Downloading safenode version 0.105.6-alpha.4...
Download completed: /tmp/8979e40e-2cca-49b5-9fe1-c55bdb1b7120/safenode
Services Added:
 ✓ safenode51
    - Safenode path: /var/safenode-manager/services/safenode51/safenode
    - Data path: /var/safenode-manager/services/safenode51
    - Log path: /var/log/safenode/safenode51
    - RPC port: 127.0.0.1:32867
 ✓ safenode52
    - Safenode path: /var/safenode-manager/services/safenode52/safenode
    - Data path: /var/safenode-manager/services/safenode52
    - Log path: /var/log/safenode/safenode52
    - RPC port: 127.0.0.1:41717
 ✓ safenode53
...... *truncated
 ✓ safenode59
    - Safenode path: /var/safenode-manager/services/safenode59/safenode
    - Data path: /var/safenode-manager/services/safenode59
    - Log path: /var/log/safenode/safenode59
    - RPC port: 127.0.0.1:44373
 ✓ safenode60
    - Safenode path: /var/safenode-manager/services/safenode60/safenode
    - Data path: /var/safenode-manager/services/safenode60
    - Log path: /var/log/safenode/safenode60
    - RPC port: 127.0.0.1:46365
sudo /home/safea/.local/bin/safenode-manager start
=================================================
             Start Safenode Services
=================================================
Refreshing the node registry...
The safenode1 service is already running
The safenode2 service is already running
The safenode3 service is already running
The safenode4 service is already running
The safenode5 service is already running
The safenode6 service is already running
The safenode7 service is already running
......
The safenode48 service is already running
The safenode49 service is already running
The safenode50 service is already running

And it stopped there. Trying to start safenode51:

safea@safe-99:/home/mav$ sudo /home/safea/.local/bin/safenode-manager start --service-name safenode51
=================================================
             Start Safenode Services
=================================================
Refreshing the node registry...
Error:
   0: No service named 'safenode51'

Location:
   sn_node_manager/src/cmd/node.rs:391

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
safea@safe-99:/home/mav$

Does it have a limit of 50 hardcoded somewhere?

edit: Hmm… I can keep trying to “add” and it always does 51-60. Like it’s not updating the registry.

edit2: More info

safea@safe-99:/home/mav$ cat /etc/systemd/system/safenode51.service
[Unit]
Description=safenode51
[Service]
ExecStart=/var/safenode-manager/services/safenode51/safenode --rpc 127.0.0.1:45725 --root-dir /var/safenode-manager/services/safenode51 --log-output-dest /var/log/safenode/safenode51 --port 14600 --peer /ip4/161.35.165.63/udp/56759/quic-v1/p2p/12D3KooWGBCKLkgwv8YWb3KQBwi9fG4GYMWhYcouKSi1TLe6qm3f
Restart=on-failure
User=safe
[Install]
WantedBy=multi-user.target

Started 51 with systemctl, did nothing with 52.

safea@safe-99:/home/mav$ sudo systemctl start safenode51
safea@safe-99:/home/mav$ systemctl status safenode51
* safenode51.service - safenode51
     Loaded: loaded (/etc/systemd/system/safenode51.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2024-04-04 22:30:41 UTC; 9s ago
   Main PID: 3227977 (safenode)
      Tasks: 23 (limit: 154418)
     Memory: 18.0M
        CPU: 207ms
     CGroup: /system.slice/safenode51.service
             `-3227977 /var/safenode-manager/services/safenode51/safenode --rpc 127.0.0.1:45725 --root-dir /var/safenod>
safea@safe-99:/home/mav$ systemctl status safenode52
* safenode52.service - safenode52
     Loaded: loaded (/etc/systemd/system/safenode52.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
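
To see at a glance which of the units systemd thinks are up, independent of what the node manager registry says, something like this sketch could be used. It relies on `systemctl is-active`, which prints the unit state; the function names are made up.

```python
import subprocess
from typing import Callable, Dict, Iterable


def _is_active(unit: str) -> str:
    # `systemctl is-active <unit>` prints the state: active, inactive, failed, ...
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit just means "not active", not an error here
    )
    return result.stdout.strip()


def unit_states(
    units: Iterable[str], query: Callable[[str], str] = _is_active
) -> Dict[str, str]:
    """Return {unit name: state} for each systemd unit asked about."""
    return {unit: query(unit) for unit in units}


if __name__ == "__main__":
    for name, state in unit_states(f"safenode{n}" for n in range(51, 61)).items():
        print(f"{name}: {state}")
```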

Hey, thanks for the report on this. There certainly isn’t a limit of 50. This could be related to the issue that had been seen by neik and Shu before.

Hey @JPL would you please be able to move this post into the node manager UX/issues thread? I’m going to try and take a look at this tomorrow and see if I can reproduce these issues with lots of nodes. I don’t think this is anything specific to this alpha net.

There are a couple of things here.
First, I started safenode51-safenode55 manually with systemctl.
Second, the ID in the TIG stack script changed from my peer ID to the name of the service for the new manually started nodes.
Third, I have not restarted or changed the script that’s running in the background.

What changes when starting manually that would make the ID reported by the script change? (Sorry, I forget who actually made the TIG stack script, or I’d tag them.)

How is the dashboard picking up the new nodes when safenode-manager still will not?

It’s all a bit borked @wes

The influx-resources job in your /etc/cron.d/ outputs to the file /tmp/influx-resources/influx-resources.

You can manually tail that file and see the raw data that is getting sent to influxdb by telegraf.

It gets the Peer IDs from safenode-manager, which is not displaying the correct info; I think it is double reading some nodes.

Not sure which version you are running. Lately I’ve dropped the frequency to 15 min; originally I had it running at a 5 min frequency.

It just looks in the folder where the nodes store their records.

@chriso not sure if this really matters, but status throws an error when it is not run as administrator.

PS C:\Users\kyte7> safenode-manager status
=================================================
                Safenode Services
=================================================
Refreshing the node registry...
Service Name       Peer ID                                              Status  Connected Peers
safenode1          12D3KooWL6fy7Vc4GY8ec4ZY8mPfQ3oAmAPzRN7dyTBRyGgBzSfa ←[32mRUNNING←[0m               9
safenode2          12D3KooWNxiPV4MMm4ZXeT9FWsf2GABhEoHWP2LSQzzhqTTVc2Pm ←[32mRUNNING←[0m               8
safenode3          12D3KooWJrPqLu6HDpKQhmZN5Ra48C3CqfujAcb7HwUjUBDgUyJG ←[32mRUNNING←[0m              11
safenode4          12D3KooWCsWa3Tf4hiUEMxSbCdqiiKdoHSWhzhko64JNLy3y9r2H ←[32mRUNNING←[0m               7
safenode5          12D3KooWPKSnJyTR3ZjXZpzGYa56UceVyjg3j3yV4QymPh4pBF8d ←[32mRUNNING←[0m             159
Error:
   0: ←[91mAccess is denied. (os error 5)←[0m

Location:
   ←[35mD:\a\safe_network\safe_network\sn_node_manager\src\cmd\node.rs←[0m:←[35m232←[0m

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Hmm, thanks. That must be Windows specific, because I went out of my way to ensure it didn’t need sudo on Linux. Will have to look into that.

TIL - I was trying crontab for all the users and couldn’t find anything. I knew it was a cron job, but didn’t know you could just create a file in cron.d. Very cool.

I’m just playing now with a new version that drops the peer ID, so it doesn’t use the safenode-manager status command, to see if that’s what was borking things.

I’ll let you know how it goes

It doesn’t?!?!?
I’ll have to look into that.

Got to run out for a bit. Before I go: I noticed a discrepancy between the status reported by vdash and the node manager.
The node manager was claiming nodes were inactive, while vdash said they were fine…
