Node Manager UX and Issues

Dipping my toes into safenode-manager. @chriso, I did not get far before hitting a bump I don't understand.

wyse3@wyse3:~$ sudo env "PATH=$PATH" safenode-manager add
=================================================
              Add Safenode Services              
=================================================
1 service(s) to be added
Created safe user account for running the service
Retrieving latest version for safenode...
Downloading safenode version 0.104.34-alpha.0...
Download completed
Error: 
   0: Could not obtain peers through any available options

Location:
   sn_node_manager/src/bin/cli/main.rs:410

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Hey Josh,

Right, so, the node manager doesn’t have the network-contacts feature, so you have to supply a peer to connect to, either via --peer or using the SAFE_PEERS environment variable.

Maybe we should change that actually.
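
The lookup order described above (the --peer argument first, then the SAFE_PEERS environment variable, and an error if neither is set) can be sketched roughly as below. This is an illustration in Python, not the node manager's actual code. The function name `resolve_peers` is hypothetical, and treating SAFE_PEERS as a comma-separated list is an assumption.

```python
import os


def resolve_peers(cli_peers, env=None):
    """Return the peers to use: --peer values first, then SAFE_PEERS.

    Hypothetical helper for illustration only. The comma-separated
    SAFE_PEERS format is an assumption, not something stated in the thread.
    """
    env = os.environ if env is None else env
    if cli_peers:
        return list(cli_peers)
    raw = env.get("SAFE_PEERS", "")
    from_env = [p.strip() for p in raw.split(",") if p.strip()]
    if from_env:
        return from_env
    # Without a network-contacts fallback there is nothing left to try.
    raise ValueError("Could not obtain peers through any available options")
```

Calling `resolve_peers([])` with no SAFE_PEERS set raises the error, which mirrors what the add command reported in the first post.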

OK, I see. My confusion was that, as I understood the instructions, running safenode-manager add was simply going to download safenode. Why does it need a --peer? Should that not be required at start instead?

Edit: but having re-read I think I understand better.

Yeah, I see your point. However, it’s necessary on the add command because it creates the service definition, and the --peer argument has to be supplied for the service definition.

OK, so that is all good. You mentioned that I will need to start in a specific way with port forwarding, but I see no mention of how in the docs?

Sorry, it’s actually also part of the add command. There’s a --port argument. You need to specify that manually if you’re using port forwarding. Quite soon, I’m going to change it to allow the specification of a range for adding multiple nodes.

So you would need to remove that node and add it again.

Sorry Chris, I hope I am not being too much of a pain here.

Shocking I know but I don’t understand something :sweat_smile:

I removed safenode1, so why is the new service now safenode2? Should it not just replace safenode1?

wyse3@wyse3:~$ sudo env "PATH=$PATH" safenode-manager remove --service-name safenode1
=================================================
           Remove Safenode Services              
=================================================
✓ Service safenode1 was removed
wyse3@wyse3:~$ sudo env "PATH=$PATH" safenode-manager add --peer /ip4/127.0.0.1/udp/4724/quic-v1/p2p/12D3KooWDuuWQfVPQuoJ6kjva6hHV3Q9iqgD2oUyK3VfujCBsyNy --port 8889
=================================================
              Add Safenode Services              
=================================================
1 service(s) to be added
The safe user already exists
Retrieving latest version for safenode...
Downloading safenode version 0.104.34-alpha.0...
Download completed
Services Added:
 ✓ safenode2
    - Safenode path: /var/safenode-manager/services/safenode2/safenode
    - Data path: /var/safenode-manager/services/safenode2
    - Log path: /var/log/safenode/safenode2
    - RPC port: 127.0.0.1:37617
[!] Note: newly added services have not been started

No, it’s OK. I invited you to do exactly what you’re doing!

Yeah, I did see the potential for confusion with this. It’s because when you remove a node, there’s an argument you can use to keep the directories around.

The node manager marks the node as removed rather than just entirely deleting it from its state. If you run status --details you will see the removed node in the list.

I might just get rid of this, as I can see it’s been confusing at the outset.

Edit: also, another reason for this is, it’s just simple to keep incrementing a number. What would happen if you added three nodes, then deleted node two? Which number should a newly added node have?
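
To make the numbering question concrete, here is a toy model in Python of a counter that only ever goes up, with removed services kept in state rather than deleted. It is a sketch of the behaviour described in this post, not the node manager's real implementation. The class and method names are made up for illustration.

```python
class ServiceRegistry:
    """Toy model of incrementing service names (hypothetical, for illustration)."""

    def __init__(self):
        self.services = {}  # name -> status, "ADDED" or "REMOVED"
        self.counter = 0    # only ever increases

    def add(self):
        self.counter += 1
        name = f"safenode{self.counter}"
        self.services[name] = "ADDED"
        return name

    def remove(self, name):
        # Mark as removed instead of deleting, so the number is never reused.
        self.services[name] = "REMOVED"


registry = ServiceRegistry()
for _ in range(3):
    registry.add()
registry.remove("safenode2")
print(registry.add())  # prints safenode4
```

With this scheme the question of which gap to fill never comes up: the next node is always one higher than the last one added.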

I don’t think it is a problem, just a matter of learning what’s going on.

This is pretty cool Chris, I can see it is the way forward already and I have barely touched it.

Why is it complaining about TCP?

wyse3@wyse3:~$ sudo env "PATH=$PATH" safenode-manager start --service-name safenode2
=================================================
             Start Safenode Services             
=================================================
Attempting to start safenode2...
Error: 
   0: Could not connect to the RPC endpoint tonic::transport::Error(Transport, hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })))

Location:
   /home/runner/work/safe_network/safe_network/sn_node_manager/src/node_control/mod.rs:239

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Right, so, first, I’m guessing that you had to use --service-name on the start command because it was also attempting to start the removed service? If that was the case, that’s a bug that needs fixed. Otherwise, if you don’t specify any argument, start will just attempt to start all nodes.

The TCP connection is for the RPC service for the node. The node may not have started correctly, and so the RPC service wasn’t launched.
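
A quick way to check whether anything is listening on the node's RPC socket is to attempt a plain TCP connection to it. The sketch below uses only the Python standard library. The helper name `rpc_port_open` is hypothetical, and it only tests that the port accepts connections, not that the RPC service behind it is healthy.

```python
import socket


def rpc_port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" (OS error 111 on Linux) and timeouts.
        return False
```

For the service above, the call would be `rpc_port_open("127.0.0.1", 37617)`, using the RPC port printed by the add command. A False result is consistent with the "Connection refused" error in the start output.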

What output do you get from status?

Correct, it tries to start the removed service.

wyse3@wyse3:~$ sudo env "PATH=$PATH" safenode-manager status --details
=================================================
                Safenode Services                
=================================================
============================
safenode1 - REMOVED
============================
Version: 0.104.34-alpha.0
Peer ID: -
RPC Socket: 127.0.0.1:36751
Listen Addresses: None
PID: -
Data path: /var/safenode-manager/services/safenode1
Log path: /var/log/safenode/safenode1
Bin path: /var/safenode-manager/services/safenode1/safenode
Connected peers: -

==========================
safenode2 - ADDED
==========================
Version: 0.104.34-alpha.0
Peer ID: -
RPC Socket: 127.0.0.1:37617
Listen Addresses: None
PID: -
Data path: /var/safenode-manager/services/safenode2
Log path: /var/log/safenode/safenode2
Bin path: /var/safenode-manager/services/safenode2/safenode
Connected peers: -

Right, thanks. I’ll get that bug sorted soon.

It looks like the node hasn’t started. Do you want to take a look at the logs and see what they say?

I wonder if there’s a way of coordinating manager and vdash node numbers?

At the moment vdash starts by assigning them incrementally. If you restart vdash, it uses values stored in the .vdash checkpoint file for each node to try to assign the same number to any nodes that are still present, and gives empty slots to any new nodes that are added.

Just something that might be nice if it fits with how the manager works. I think it’s useful for GUIs to be able to maintain a simple consistent identifier between vdash restarts, or manager restarts, and ideally both! We have the peer id but it is too long for a user to remember, track or compare, so I like having both a number and the id in any GUI or status.
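
The slot behaviour described above (nodes that are still present keep their saved number, new nodes take an empty slot) could look something like this. It is a simplified Python sketch, not vdash's actual code. The function name and the shape of the saved data are assumptions.

```python
def assign_slots(saved, present):
    """Map node IDs to display numbers.

    saved:   {node_id: number} from a previous run (hypothetical format)
    present: node IDs that exist now, in display order
    """
    # Nodes that are still present keep their old number.
    slots = {node: saved[node] for node in present if node in saved}
    used = set(slots.values())
    next_free = 1
    for node in present:
        if node in slots:
            continue
        # New nodes take the lowest number that nobody is using.
        while next_free in used:
            next_free += 1
        slots[node] = next_free
        used.add(next_free)
    return slots
```

For example, with saved numbers for nodes A, B and C, and B since removed, a new node D would be given B's old slot while A and C keep theirs. Note that this is the opposite choice from the manager's always-incrementing names, which is why the two sets of numbers can drift apart.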

Sorry, I actually haven’t ever gotten round to using vdash myself yet, so I don’t really know much about it. I wanted to start participating in the testnets with some Pis I have, and was going to get vdash running as part of that, but unfortunately I am kind of crippled with poor bandwidth on my connection at home. Still have a very poor upload speed, and running lots of nodes seems to render my connection pretty unusable.

I would be happy to try and help where I can though.

I don’t have any logs. Presumably they should be at .local/share/safe/node/xxxxxx/logs?

If you look at the status output from earlier, you should see that for nodes managed as services, the logging directories are at /var/log/safenode/safenodeX. The data directory is also at a different location.

For nodes added with the node manager, they should not be generating files at ~/.local/share/safe.

Well that is too obvious Chris, old dog new tricks and all that :man_facepalming: :laughing:

wyse3@wyse3:/var/log/safenode/safenode2$ cat safenode.log
[2024-02-21T16:56:38.874943Z INFO sn_peers_acquisition] Using peers supplied with the --peer argument(s) or SAFE_PEERS
[2024-02-21T16:56:38.875111Z INFO safenode] 
Running safenode v0.104.34-alpha.0
==================================
[2024-02-21T16:56:38.875122Z DEBUG safenode] Built with git version: 790d17d / alpha-test / 790d17d
[2024-02-21T16:56:38.875128Z INFO safenode] Node started with initial_peers ["/ip4/127.0.0.1/udp/4724/quic-v1/p2p/12D3KooWDuuWQfVPQuoJ6kjva6hHV3Q9iqgD2oUyK3VfujCBsyNy"]
[2024-02-21T16:56:38.875712Z INFO safenode] Starting node ...
[2024-02-21T16:56:38.880038Z WARN sn_transfers::wallet::hot_wallet] No main key found when loading wallet from path, generating a new one with pubkey: b7662c94e7a3533e3da3215bb72826e256c4edd60e2e11e30ecec7ee028b16ea1617a162028e12d8524574459e41bb06
[2024-02-21T16:56:38.883733Z INFO sn_networking::driver] Process (PID: 152056) with PeerId: 12D3KooWAswyPWHDxAgLqhTfWn175kEuXWRvhk4yXRRY1V8KvQZA
[2024-02-21T16:56:38.889350Z INFO sn_networking::driver] Self PeerID 12D3KooWAswyPWHDxAgLqhTfWn175kEuXWRvhk4yXRRY1V8KvQZA is represented as kbucket_key f111109c6fc413597fb12173e36eb48d8bff6f03766d6df9ada26e5647fc6d7c
[2024-02-21T16:56:38.897960Z INFO sn_networking::record_store] Attempting to repopulate records from existing store...
[2024-02-21T16:56:39.237082Z INFO sn_networking::network_discovery] Time to generate NetworkDiscoveryCandidates: 287.830284ms
[2024-02-21T16:56:39.248367Z DEBUG sn_logging::metrics] {"physical_cpu_threads":4,"system_cpu_usage_percent":100.0,"process":{"cpu_usage_percent":63.013695,"memory_used_mb":16,"bytes_read":249856,"bytes_written":0,"total_mb_read":11,"total_mb_written":0}}
[2024-02-21T16:56:39.237130Z INFO sn_networking::network_discovery] The generated network discovery candidates currently cover these ilog2 buckets: [(243, 1), (244, 2), (245, 5), (246, 5), (247, 5), (248, 5), (249, 5), (250, 5), (251, 5), (252, 5), (253, 5), (254, 5), (255, 5)]
[2024-02-21T16:56:39.694709Z INFO sn_peers_acquisition] Using peers supplied with the --peer argument(s) or SAFE_PEERS
[2024-02-21T16:56:39.694781Z INFO safenode] 
Running safenode v0.104.34-alpha.0
==================================
[2024-02-21T16:56:39.694790Z DEBUG safenode] Built with git version: 790d17d / alpha-test / 790d17d
[2024-02-21T16:56:39.694796Z INFO safenode] Node started with initial_peers ["/ip4/127.0.0.1/udp/4724/quic-v1/p2p/12D3KooWDuuWQfVPQuoJ6kjva6hHV3Q9iqgD2oUyK3VfujCBsyNy"]
[2024-02-21T16:56:39.694891Z INFO safenode] Starting node ...
[2024-02-21T16:56:39.697279Z INFO sn_transfers::wallet] Attempting to read wallet file
[2024-02-21T16:56:39.697400Z DEBUG sn_transfers::wallet::watch_only] Loaded wallet from "/var/safenode-manager/services/safenode2/wallet" with balance NanoTokens(0)
[2024-02-21T16:56:39.697605Z INFO sn_networking::driver] Process (PID: 152068) with PeerId: 12D3KooWAswyPWHDxAgLqhTfWn175kEuXWRvhk4yXRRY1V8KvQZA
[2024-02-21T16:56:39.697680Z INFO sn_networking::driver] Self PeerID 12D3KooWAswyPWHDxAgLqhTfWn175kEuXWRvhk4yXRRY1V8KvQZA is represented as kbucket_key f111109c6fc413597fb12173e36eb48d8bff6f03766d6df9ada26e5647fc6d7c
[2024-02-21T16:56:39.697864Z INFO sn_networking::record_store] Attempting to repopulate records from existing store...
[2024-02-21T16:56:40.069626Z INFO sn_networking::network_discovery] Time to generate NetworkDiscoveryCandidates: 367.871279ms
[2024-02-21T16:56:40.069681Z INFO sn_networking::network_discovery] The generated network discovery candidates currently cover these ilog2 buckets: [(242, 2), (244, 5), (245, 5), (246, 5), (247, 5), (248, 5), (249, 5), (250, 5), (251, 5), (252, 5), (253, 5), (254, 5), (255, 5)]
[2024-02-21T16:56:40.074161Z DEBUG sn_logging::metrics] {"physical_cpu_threads":4,"system_cpu_usage_percent":100.0,"process":{"cpu_usage_percent":24.161074,"memory_used_mb":16,"bytes_read":0,"bytes_written":0,"total_mb_read":0,"total_mb_written":0}}
[2024-02-21T16:56:40.435628Z INFO sn_peers_acquisition] Using peers supplied with the --peer argument(s) or SAFE_PEERS
[2024-02-21T16:56:40.435695Z INFO safenode] 
Running safenode v0.104.34-alpha.0
==================================
[2024-02-21T16:56:40.435703Z DEBUG safenode] Built with git version: 790d17d / alpha-test / 790d17d
[2024-02-21T16:56:40.435709Z INFO safenode] Node started with initial_peers ["/ip4/127.0.0.1/udp/4724/quic-v1/p2p/12D3KooWDuuWQfVPQuoJ6kjva6hHV3Q9iqgD2oUyK3VfujCBsyNy"]
[2024-02-21T16:56:40.435807Z INFO safenode] Starting node ...
[2024-02-21T16:56:40.441728Z INFO sn_transfers::wallet] Attempting to read wallet file
[2024-02-21T16:56:40.441846Z DEBUG sn_transfers::wallet::watch_only] Loaded wallet from "/var/safenode-manager/services/safenode2/wallet" with balance NanoTokens(0)
[2024-02-21T16:56:40.445477Z INFO sn_networking::driver] Process (PID: 152080) with PeerId: 12D3KooWAswyPWHDxAgLqhTfWn175kEuXWRvhk4yXRRY1V8KvQZA
[2024-02-21T16:56:40.445587Z INFO sn_networking::driver] Self PeerID 12D3KooWAswyPWHDxAgLqhTfWn175kEuXWRvhk4yXRRY1V8KvQZA is represented as kbucket_key f111109c6fc413597fb12173e36eb48d8bff6f03766d6df9ada26e5647fc6d7c
[2024-02-21T16:56:40.458461Z INFO sn_networking::record_store] Attempting to repopulate records from existing store...
[2024-02-21T16:56:40.854067Z INFO sn_networking::network_discovery] Time to generate NetworkDiscoveryCandidates: 391.202514ms
[2024-02-21T16:56:40.854117Z INFO sn_networking::network_discovery] The generated network discovery candidates currently cover these ilog2 buckets: [(242, 1), (243, 2), (244, 1), (245, 5), (246, 5), (247, 5), (248, 5), (249, 5), (250, 5), (251, 5), (252, 5), (253, 5), (254, 5), (255, 5)]
[2024-02-21T16:56:40.858151Z DEBUG sn_logging::metrics] {"physical_cpu_threads":4,"system_cpu_usage_percent":100.0,"process":{"cpu_usage_percent":21.686749,"memory_used_mb":17,"bytes_read":0,"bytes_written":0,"total_mb_read":0,"total_mb_written":0}}
[2024-02-21T16:56:41.197631Z INFO sn_peers_acquisition] Using peers supplied with the --peer argument(s) or SAFE_PEERS
[2024-02-21T16:56:41.197703Z INFO safenode] 

Hmm, strange. It looks like the node is bombing out, without any obvious reason. I don’t think this is a manager issue, as such, although I could be wrong.

Not sure where to go from here on this one.
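
One way to see the pattern in a log like the one above is to pull out the "Process (PID: ...)" lines and compare the PIDs and peer IDs. This is a small Python sketch with a hypothetical helper name. The regular expression is based only on the log lines shown in this thread.

```python
import re

# Matches lines such as:
#   ... Process (PID: 152056) with PeerId: 12D3KooW...
PID_LINE = re.compile(r"Process \(PID: (\d+)\) with PeerId: (\S+)")


def summarize_restarts(log_text):
    """Return (distinct PIDs in order of appearance, set of peer IDs)."""
    pids, peer_ids = [], set()
    for match in PID_LINE.finditer(log_text):
        pid = int(match.group(1))
        if pid not in pids:
            pids.append(pid)
        peer_ids.add(match.group(2))
    return pids, peer_ids
```

Run against the safenode.log excerpt above, this finds three PIDs (152056, 152068, 152080) and a single peer ID within about two seconds, which suggests the same node is being launched again under a new process each time rather than running continuously.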

I just pulled a random multiaddr from my other node’s logs. Could it be that? Or, Josh did say alpha was a potential issue. Any advice on a multiaddr to use instead?
