Node Manager UX and Issues

That is probably best. But if needed you could just lock the registry file and have safenode-manager wait up to 10 seconds, or else abort and tell the user to try again later. And allow reset to clear the lock.
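A rough sketch of that lock-with-timeout idea, using an atomic `mkdir` as the lock. The lock path and function names are made up for illustration; this is not a description of what safenode-manager actually does with the registry.

```shell
# Hypothetical lock location; override LOCK_DIR to suit.
LOCK_DIR="${LOCK_DIR:-/tmp/node_registry.lock}"

acquire_lock() {
    # Wait up to $1 seconds (default 10) for the lock, then give up.
    timeout_secs="${1:-10}"
    waited=0
    # mkdir is atomic: it succeeds for exactly one caller at a time.
    while ! mkdir "$LOCK_DIR" 2>/dev/null; do
        if [ "$waited" -ge "$timeout_secs" ]; then
            echo "Registry is locked, try again later" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

release_lock() {
    # Also what a 'reset' could call to clear a stale lock.
    rmdir "$LOCK_DIR" 2>/dev/null || true
}
```

A caller would do `acquire_lock 10 || exit 1`, touch the registry, then `release_lock`.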

Yeah, perhaps. I’m not sure that it would be completely required to re-write the state or not. I’ll need to examine that further when I come to do it.

I’m running a safenode on PunchBowl on a Mac for the first time. Which is nice. Except that I didn’t start it! I merely added it with:-

sudo ~/.local/bin/safenode-manager add --home-network --count 1

and I got:-

[!] Note: newly added services have not been started

after the add command.

Then I opened another terminal and got ready to tail -f the log file because I’m not running vdash and wanted to see the start. I was surprised to see the log dir and file already there and tailed it and it’s already blazing away storing and retrieving records!

safenode-manager still says:-

sudo ~/.local/bin/safenode-manager status --details
Refreshing the node registry...
╔═══════════════════════╗
║   safenode1 - ADDED   ║
╚═══════════════════════╝
Version: 0.106.2
Peer ID: -
RPC Socket: 127.0.0.1:49601
Listen Addresses: None
PID: -
Data path: /var/safenode-manager/services/safenode1
Log path: /var/log/safenode/safenode1
Bin path: /var/safenode-manager/services/safenode1/safenode
Connected peers: -
Reward balance: 0.000000000

So it thinks it isn’t started.

and yet:-

[2024-05-11T16:55:17.873457Z INFO sn_networking::record_store] Retrieved record from disk! filename: f2fa905f523e789e279015793e3098c642279d0fd03bbdf74e1ff04b8302086f after 487.232µs
[2024-05-11T16:55:18.072362Z INFO sn_networking::record_store] Retrieved record from disk! filename: d0706533f35bef2f6ff9662fbb641cb05abf127eaaab8bad2d18e4a987c78d71 after 392.305µs
[2024-05-11T16:55:18.093172Z INFO sn_networking::record_store] Retrieved record from disk! filename: fd2f97535efa85f0d1bd2076b32dd48d3442baeb2cdbc8723e5f4cd4891fd921 after 395.471µs
[2024-05-11T16:55:18.110161Z INFO sn_networking::record_store] Retrieved record from disk! filename: d1829804098bc171a30b50636298fb3a98b3142491d957ad3853a9c8b77e8607 after 378.847µs
[2024-05-11T16:55:19.347999Z INFO sn_networking::event::swarm] Dcutr with remote peer: PeerId("12D3KooWRL7MXFwrqztK2FJs4D25MRt96XjgiqK77UJYohyWacFD") is: Ok(ConnectionId(5021))
[2024-05-11T16:55:20.563557Z DEBUG sn_logging::metrics] {"physical_cpu_threads":4,"system_cpu_usage_percent":10.186605,"process":{"cpu_usage_percent":3.0407562,"memory_used_mb":70,"bytes_read":0,"bytes_written":0,"total_mb_read":257,"total_mb_written":65}}
[2024-05-11T16:55:24.991208Z INFO sn_networking::event::swarm] relay client event event=InboundCircuitEstablished { src_peer_id: PeerId("12D3KooWFHaPkpgnojof5cUsbPPozESkSZrXgW39JGSHVwZRLQim"), limit: Some(Limit { duration: Some(120s), data_in_bytes: Some(131072) }) }

etc

So I tried a stop and got:-

sudo ~/.local/bin/safenode-manager stop                        
╔════════════════════════════╗
║   Stop Safenode Services   ║
╚════════════════════════════╝
Refreshing the node registry...
Service safenode1 has not been started since it was installed

So I tried a start and got:-

sudo ~/.local/bin/safenode-manager start
╔═════════════════════════════╗
║   Start Safenode Services   ║
╚═════════════════════════════╝
Refreshing the node registry...
Attempting to start safenode1...
✓ Started safenode1 service
  - PID: 1272
  - Bin path: /var/safenode-manager/services/safenode1/safenode
  - Data path: /var/safenode-manager/services/safenode1
  - Logs path: /var/log/safenode/safenode1

And now status shows:-

sudo ~/.local/bin/safenode-manager status
╔═══════════════════════╗
║   Safenode Services   ║
╚═══════════════════════╝
Refreshing the node registry...
Service Name       Peer ID                                              Status  Connected Peers
safenode1          12D3KooWHkWWjjbbK71sdXGEHW3uHQ7bE93CdNVohXdMHdxMhhaK RUNNING              37

So I’ve ended up in the right place now but it was a bit odd.

I didn’t have any services set up before. This was a fresh install of safeup and safenode-manager.

safeup 0.7.0
sn-node-manager 0.7.5

Hmm, I don’t really see any code route to the node manager starting a service after add.

Are you sure that this wasn’t a node that was already running or something? Was this a fresh machine?

I had got as far as installing safenode-manager for the last one but never ran a node. safenode-manager didn’t have any nodes showing according to status. I didn’t check the registry though. I think I’ll tear it all down, delete the registry file and make sure it’s all absolutely clean.

I’ve just tried again and got the same behaviour. I stopped the node, did a safenode-manager reset, and made sure the /var/safenode-manager/services/safenode1/safenode file wasn’t there. Then I added the node and the node starts running!

Then it’s the same thing again: when I try to stop the node it says Service safenode1 has not been started since it was installed

So I do a start and then I can stop it.

Just tried again adding 3 safenodes and they all start up.

Is there maybe something apart from the files in /var/safenode-manager/services that gets added and is Mac specific?

Edit
Forgot to mention that I’ve seen a node that had not been explicitly started, but was nevertheless running, earn in this state. So I’m sure it’s a bona fide node. It’s just something weird and undoubtedly Mac related. Must be something about the way that services are handled. Despite using OSX since 10.1 I don’t know much about it ‘under the hood’. I just use it as a nice desktop and terminal to other things.
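On macOS, system services are normally managed by launchd, so one place to look is `sudo launchctl list`, whose columns are PID, last exit status and label. This is only a guess at where to dig, not something confirmed in this thread. A small filter that prints the labels launchd considers running (the function name and the `safenode` label pattern are assumptions for illustration):

```shell
running_labels() {
    # Reads `launchctl list` output on stdin (columns: PID, Status, Label).
    # Prints labels matching pattern $1 that currently have a numeric PID,
    # i.e. jobs launchd reports as running. A "-" in the PID column means not running.
    awk -v pat="$1" '$1 ~ /^[0-9]+$/ && $3 ~ pat {print $3}'
}

# Usage (macOS): sudo launchctl list | running_labels safenode
```

If a label shows up here straight after an add, the service definition itself is probably telling launchd to run the job on load, independently of what the node registry records.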

@chriso
The failed to start is back in force. This is after a reset.

wyse1@wyse1:~$ safeup node-manager
**************************************
*                                    *
*    Installing safenode-manager     *
*                                    *
**************************************
Installing safenode-manager for x86_64-unknown-linux-musl at /home/wyse1/.local/bin...
Retrieving latest version for safenode-manager...
Installing safenode-manager version 0.7.5...
  [00:00:02] [########################################] 5.39 MiB/5.39 MiB (0s)
safenode-manager 0.7.5 is now available at /home/wyse1/.local/bin/safenode-manager
wyse1@wyse1:~$ sudo safenode-manager add --count 5 --version 0.106.2 --home-network
╔═══════════════════════════╗
║   Add Safenode Services   ║
╚═══════════════════════════╝
5 service(s) to be added
The safe user already exists
Download completed: /var/safenode-manager/downloads/safenode
Services Added:
 ✓ safenode1
    - Safenode path: /var/safenode-manager/services/safenode1/safenode
    - Data path: /var/safenode-manager/services/safenode1
    - Log path: /var/log/safenode/safenode1
    - RPC port: 127.0.0.1:41373
 ✓ safenode2
    - Safenode path: /var/safenode-manager/services/safenode2/safenode
    - Data path: /var/safenode-manager/services/safenode2
    - Log path: /var/log/safenode/safenode2
    - RPC port: 127.0.0.1:44127
 ✓ safenode3
    - Safenode path: /var/safenode-manager/services/safenode3/safenode
    - Data path: /var/safenode-manager/services/safenode3
    - Log path: /var/log/safenode/safenode3
    - RPC port: 127.0.0.1:37381
 ✓ safenode4
    - Safenode path: /var/safenode-manager/services/safenode4/safenode
    - Data path: /var/safenode-manager/services/safenode4
    - Log path: /var/log/safenode/safenode4
    - RPC port: 127.0.0.1:44549
 ✓ safenode5
    - Safenode path: /var/safenode-manager/services/safenode5/safenode
    - Data path: /var/safenode-manager/services/safenode5
    - Log path: /var/log/safenode/safenode5
    - RPC port: 127.0.0.1:38385
[!] Note: newly added services have not been started
wyse1@wyse1:~$ sudo safenode-manager start
╔═════════════════════════════╗
║   Start Safenode Services   ║
╚═════════════════════════════╝
Refreshing the node registry...
Attempting to start safenode1...
Attempting to start safenode2...
Attempting to start safenode3...
Attempting to start safenode4...
Attempting to start safenode5...
Failed to start 5 service(s):
✕ safenode1: The 'safenode1' service has failed to start
✕ safenode2: The 'safenode2' service has failed to start
✕ safenode3: The 'safenode3' service has failed to start
✕ safenode4: The 'safenode4' service has failed to start
✕ safenode5: The 'safenode5' service has failed to start
Error: 
   0: Failed to start one or more services

Location:
   sn_node_manager/src/cmd/node.rs:576

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

wyse1@wyse1:~$ systemctl status safenode1.service
● safenode1.service - safenode1
     Loaded: loaded (/etc/systemd/system/safenode1.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2024-05-11 15:04:56 EDT; 4h 23min ago
   Main PID: 1501 (safenode)
      Tasks: 11 (limit: 8734)
     Memory: 33.7M
        CPU: 9min 35.136s
     CGroup: /system.slice/safenode1.service
             └─1501 /var/safenode-manager/services/safenode1/safenode --rpc 127.0.0.1:45041 --root-dir /var/safenode-manager/services/safenode1 --log-output-dest /var/log/safenode/safenode1 --home-network

May 11 15:04:56 wyse1 systemd[1]: Started safenode1.
May 11 15:04:56 wyse1 safenode[1501]: Logging to directory: "/var/log/safenode/safenode1"
May 11 15:05:17 wyse1 safenode[1501]: Node started
May 11 15:05:17 wyse1 safenode[1501]: PeerId is 12D3KooWJ2KBg6gxph9xMVdK8tV5QUTTbmA1HdiHXjJqjr3sRGaS
May 11 15:05:17 wyse1 safenode[1501]: You can check your reward balance by running:
May 11 15:05:17 wyse1 safenode[1501]: `safe wallet balance --peer-id=12D3KooWJ2KBg6gxph9xMVdK8tV5QUTTbmA1HdiHXjJqjr3sRGaS`
May 11 15:05:17 wyse1 safenode[1501]:     
May 11 15:05:17 wyse1 safenode[1501]: RPC Server listening on 127.0.0.1:45041
May 11 16:45:23 wyse1 systemd[1]: safenode1.service: Current command vanished from the unit file, execution of the command list won't be resumed.

wyse1@wyse1:~$ sudo safenode-manager stop
╔════════════════════════════╗
║   Stop Safenode Services   ║
╚════════════════════════════╝
Refreshing the node registry...
Service safenode1 has not been started since it was installed
Service safenode2 has not been started since it was installed
Service safenode3 has not been started since it was installed
Service safenode4 has not been started since it was installed
Service safenode5 has not been started since it was installed
systemctl list-units --type=service | grep safenode | grep running
● safenode1.service                                     not-found active running safenode1.service
● safenode10.service                                    not-found active running safenode10.service
● safenode12.service                                    not-found active running safenode12.service
● safenode15.service                                    not-found active running safenode15.service
● safenode16.service                                    not-found active running safenode16.service
● safenode17.service                                    not-found active running safenode17.service
● safenode18.service                                    not-found active running safenode18.service
● safenode19.service                                    not-found active running safenode19.service
● safenode2.service                                     not-found active running safenode2.service
● safenode20.service                                    not-found active running safenode20.service
● safenode21.service                                    not-found active running safenode21.service
● safenode22.service                                    not-found active running safenode22.service
● safenode23.service                                    not-found active running safenode23.service
● safenode24.service                                    not-found active running safenode24.service
● safenode25.service                                    not-found active running safenode25.service
● safenode26.service                                    not-found active running safenode26.service
● safenode27.service                                    not-found active running safenode27.service

This solved my issues: ps aux | grep safenode | grep -v grep | awk '{print $2}' | xargs sudo kill

Is that not just sudo pkill safenode?

Where’s the fun in that?!

Or indeed killall safenode (killall is fairly new to me as well).
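For anyone puzzling over the longer pipeline: `ps aux` prints the PID in the second column, the two greps keep the safenode lines and drop the grep process itself, and awk pulls out that column. A side-effect-free version that only prints the PIDs (the function name is made up for illustration):

```shell
safenode_pids() {
    # Reads `ps aux`-style output on stdin and prints the PID (column 2)
    # of every line mentioning safenode, skipping the grep process itself.
    grep safenode | grep -v grep | awk '{print $2}'
}

# Usage: ps aux | safenode_pids                      # look before you kill
#        ps aux | safenode_pids | xargs sudo kill    # same effect as the one-liner
```

Note that both this pattern match and `pkill safenode` will also catch anything else with safenode in its name, such as a running safenode-manager, whereas `killall safenode` matches the exact process name.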

But how come the firewall let the chunks in, but not the money?

Sorry @josh that sounded bad when I read it. It was not meant to be a slagging, but sounded like it, so you owe me one :wink:

None taken, didn’t think you were.

I know nothing about systemctl or journalctl, which Chris suggested I look at, so I roped in our friend Mr AI, who tried to help me figure out why I couldn’t start nodes despite reset and the manager telling me none were running.

It suggested that and I obediently followed. :slightly_smiling_face: It worked, so I didn’t complain.

My node could have the required chunks churned to it. But it could not receive unsolicited requests for quotes from clients, so nobody wanted to upload chunks to me and pay (which would not have worked anyhow). I also suspect that, since I was already talking to my peers, the firewall on the computer let their packets through because they looked like replies to me. Clients, on the other hand, were not being talked to, so the firewall rightly blocked them.

As development moves forward, AutoNAT and UPnP IGD are meant to work for clients too, not just between nodes, right?

Clients don’t need it.

I was talking about a packet that the client sends to my node and that the PC the node is on rejects. The client did everything right; the firewall just blocked it because the port was not open on my PC.

Any idea, Rob?

sudo safenode-manager add --count 45 --version 0.106.2 --node-port xxxx-xxxx
╔═══════════════════════════╗
║   Add Safenode Services   ║
╚═══════════════════════════╝
45 service(s) to be added
The safe user already exists
Downloading safenode version 0.106.2...
Download completed: /var/safenode-manager/downloads/safenode
Failed to add 45 service(s):
✕ safenode1: Failed to enable unit: "multi-user.targetget" is not a valid unit name.

✕ safenode2: Failed to enable unit: "multi-user.targetget" is not a valid unit name.

✕ safenode3: Failed to enable unit: "multi-user.targetget" is not a valid unit name.
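
The "multi-user.targetget" in that error looks like a mangled `multi-user.target`, which is the usual `WantedBy=` value in the [Install] section of a systemd unit. Assuming the unit files live in /etc/systemd/system (the path shown in the systemctl status output earlier in the thread) and that the bad value is on the `WantedBy=` line, which is a guess, one rough way to spot odd entries is:

```shell
odd_wantedby() {
    # Prints "file:line" for every WantedBy= entry in the given unit files
    # whose value is not exactly multi-user.target. Prints nothing if all look fine.
    grep -H '^WantedBy=' "$@" | grep -v '=multi-user\.target$' || true
}

# Usage: odd_wantedby /etc/systemd/system/safenode*.service
```

Any file it lists is worth opening by hand before trying the add again.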

Nope.

What drive setup have you got? (no answer required, just consider for yourself) Maybe node-manager is specifying the directory in a way that isn’t translating very well, or the OS is getting upset.

Also, they were low port numbers. High enough AFAIK, but maybe they conflict with other apps.

Those are what I have always used, never seen that before. I have been using --home-network lately so wasn’t sure if I am missing a trick.

I always choose different listening ports because there may be other nodes out there still trying to talk to my old nodes, and the new nodes go WHAT? Also, with --home-network there is no need to specify ports, port-forward any ports, or open firewall ports.

I think I had the same issue on one machine. It happened after I messed with the services… I had /etc/systemd/system/safenodeX.service files (and maybe the service was even enabled and referenced somewhere). So maybe first do a systemctl disable safenodeX before deleting the service files…
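
A sketch of that disable-before-delete order as a dry run: the function below only prints the commands for safenode1..safenodeN so they can be reviewed first. The function name is made up, and the unit path is taken from the systemctl status output earlier in the thread; the final `daemon-reload` and `reset-failed` are my own additions to make systemd forget the removed units.

```shell
cleanup_cmds() {
    # Print (not run) the cleanup commands for safenode1..safenode$1.
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "systemctl stop safenode$i.service"
        echo "systemctl disable safenode$i.service"
        echo "rm -f /etc/systemd/system/safenode$i.service"
        i=$((i + 1))
    done
    # Make systemd re-read unit files and drop stale failed/not-found entries.
    echo "systemctl daemon-reload"
    echo "systemctl reset-failed"
}

# Usage: cleanup_cmds 5             # review the list
#        cleanup_cmds 5 | sudo sh   # then actually run it
```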

I assume the node manager is out of sync with your system status
