Node Manager UX and Issues

As far as I know, we could use any peer from the testnet, so any of the ones posted in the thread should work.

Edit: I’ll see if I can connect on my Pi.

Btw, how did you get a node manager binary?

wget from latest release.

Changed the peer and :partying_face: Thanks Chris!!

=================================================
             Start Safenode Services             
=================================================
Attempting to start safenode3...
✓ Started safenode3 service
  - Peer ID: 12D3KooWRkm55qj2jX7VP4FB2P6mFRdEGhD6kmgfrHYFRowWoYNp
  - Logs: /var/log/safenode/safenode3

Ah ok, nice one.

You can add more nodes using the --count argument. However, it doesn’t apply at the moment for running a node at home with port forwarding, because that requires specifying the port manually. So each node has to be added individually. Like I said though, soon I will change the --port argument to accept a range. So if you’ve used something like 12000-12100 as a range, we can specify that in the --port argument, and use it along with --count.
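
To make the planned behaviour concrete, here is a minimal sketch of how a range such as 12000-12100 could be combined with a node count. The helper names `parse_port_range` and `assign_ports` are hypothetical and only illustrate the idea; this is not the manager's actual implementation.

```rust
use std::ops::RangeInclusive;

/// Parse a range such as "12000-12100" into an inclusive range of ports.
/// Hypothetical helper: the manager's real parsing may differ.
fn parse_port_range(input: &str) -> Result<RangeInclusive<u16>, String> {
    let (start, end) = input
        .split_once('-')
        .ok_or_else(|| format!("expected START-END, got '{input}'"))?;
    let start: u16 = start
        .trim()
        .parse()
        .map_err(|e| format!("bad start port: {e}"))?;
    let end: u16 = end
        .trim()
        .parse()
        .map_err(|e| format!("bad end port: {e}"))?;
    if start > end {
        return Err(format!("start port {start} is greater than end port {end}"));
    }
    Ok(start..=end)
}

/// Hand out one port per node, failing if the range is too small for `count`.
fn assign_ports(range: RangeInclusive<u16>, count: usize) -> Result<Vec<u16>, String> {
    let ports: Vec<u16> = range.take(count).collect();
    if ports.len() < count {
        return Err(format!(
            "range only holds {} port(s), need {count}",
            ports.len()
        ));
    }
    Ok(ports)
}
```

With a range of 12000-12100 and a count of 3, this sketch would hand the three nodes ports 12000, 12001 and 12002, and it rejects a count larger than the forwarded range.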

Regarding upgrades, the idea would be to run upgrade to get on the latest version for all the nodes that have been added using the manager. It will retain the peer ID and data for each node.

Yes, I specify a range, so that will be great.

I think I drew this out a bit more than needed, and then there was the dodgy --peer, so it really could have been done far more quickly. Love it!

That’s OK. It was a good exercise to go through.

Glad you pair did it. Now I have an idea of what to do, though I’ll need to wait for the port range support before I can have a go :slight_smile:

Cool. I’ll get that in soon.

@chriso I just checked in on that node I started and it has no records.
My first guess is that I am being detected as behind NAT, but my port forwarding rules have not changed and I started this node with a port I ordinarily use.
status reports 0 connected peers.

OK, thanks. I don’t think I could comment much further on that, because I don’t think it’s an issue with the manager. Let’s see how things go with the next testnet.

Yes, it is not the manager. I discovered that the node version is what is causing my issue.

@chriso when dealing with multiple nodes it would be nice to be able to do stop --service-name safenode3 safenode4 but it seems I can only do one at a time?

Sure I can add that.

Btw, if you use stop without any arguments, that would stop all the services.

Edit: although it could be subject to the same bug with starting the services.

Awesome. Yes, I figured, but I didn’t want to also kill the node that was working. So being able to selectively start, stop, or remove multiple services at a time would be :fire:
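
The selection rule being discussed (no names means every service, otherwise only the named ones) can be sketched roughly as follows. `select_services` is a hypothetical helper written for illustration, not code from the manager.

```rust
/// Decide which services a command such as `stop` should act on.
/// An empty `requested` list means "all services", as described in the thread.
/// Hypothetical helper, not taken from the node manager.
fn select_services<'a>(
    all: &'a [&'a str],
    requested: &[&str],
) -> Result<Vec<&'a str>, String> {
    if requested.is_empty() {
        return Ok(all.to_vec());
    }
    let mut selected = Vec::new();
    for name in requested {
        // Reject names that do not match a registered service.
        match all.iter().find(|known| *known == name) {
            Some(known) => selected.push(*known),
            None => return Err(format!("unknown service '{name}'")),
        }
    }
    Ok(selected)
}
```

Passing safenode3 and safenode4 would select just those two and leave the working node alone, while an unknown name is reported as an error rather than silently ignored.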

When building the ValentinesNet build, I get the following error with sn-node-manager:

safe-build-122:/.../safe_github/safe_network# cargo build --release --features metrics --features open-metrics
warning: dropping unsupported crate type `cdylib` for target `x86_64-unknown-linux-musl`

warning: `sn_networking` (lib) generated 1 warning
warning: `sn_client` (lib) generated 1 warning (1 duplicate)
   Compiling sn-node-manager v0.3.1 (/.../safe_github/safe_network/sn_node_manager)
error: linking with `cc` failed: exit status: 1
....
  = note: /usr/lib/gcc/x86_64-alpine-linux-musl/12.2.1/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lbz2: No such file or directory
          collect2: error: ld returned 1 exit status
          

error: could not compile `sn-node-manager` (bin "safenode-manager") due to 1 previous error

This is with branch: sn_node-v0.104.22 when compiling on Alpine LXC:

safe-build-122:~# cat /etc/os-release 
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.17.0
PRETTY_NAME="Alpine Linux v3.17"

I haven’t figured out yet how to fix the linker requirements with bz2. I tried:

apk add libbz2
apk add bzip2-dev

But it continues to give the same error. Is anyone else encountering this issue?

@chriso - I started experimenting with safenode-manager and how I could incorporate that into my own setup at home (if it makes sense to). Thank you for building this feature set out!

Below is a summary of my experience going through it for the first time on the latest release. I must note that I am trying this in an LXC container, which is probably not a typical setup compared to a full-blown Ubuntu OS etc:

  • In my setup, I systematically assign each safenode’s RPC port as a constant base # + node # + common RPC offset. This lets me deterministically reach the RPC endpoint from an external program or probe, inside one LXC or across all LXCs if required. The manager doesn’t seem to accept a manual override for the RPC port. That might be fine for most users, but they then need to parse the JSON from safenode-manager status to extract the RPC port and use it as they see fit:
    error: invalid value '192.168.X.X:Y' for '--rpc-address <RPC_ADDRESS>': invalid IPv4 address syntax
    
  • If no --user is specified, it attempts to add a user with useradd, but that command doesn’t exist by default inside certain containers, e.g. Alpine. I got around this by passing a previously created user via --user; it then skipped that step, confirming the user was already present:
    Error: 
       0: No such file or directory (os error 2)
    Location:
       sn_node_manager/src/service.rs:52
    
  • In my case, I am using custom builds with more feature flags, particularly the metrics-port, which the current safenode-manager doesn’t seem to support. It might be good to introduce that as an optional argument, or to have a simple trailing catch-all option that passes any miscellaneous args through to the safenode process. That would leave it to the user to decide what else to pass to safenode beyond the arguments the manager exposes directly as 1:1 wrappers.
  • Instead of passing an HTTP URL for safenode.zip (custom build), I passed a full path on the local file system, but that errored out (I was kind of expecting it to break here). Maybe add support for relative or full local file system paths, as well as HTTP, for custom builds:
    Error while downloading release. Trying again 3/3: ReqwestError(reqwest::Error { kind: Builder, source: RelativeUrlWithoutBase })
    Error: 
       0: Failed to download release after 3 tries.
    Location:
       /home/runner/work/safe_network/safe_network/sn_node_manager/src/helpers.rs:46
    
  • I then tried simply giving it the ValentinesNet version, i.e. --version sn_node-v0.104.22. This resulted in an error with the HTTP fetch (see below). It would be good if the example in the --help stated that the bare ‘0.104.22’ is the required input, rather than the tag name itself (I assumed the wrong input on my first attempt):
    Error while downloading release. Trying again 3/3: ReleaseBinaryNotFound("https://sn-node.s3.eu-west-2.amazonaws.com/safenode-sn_node-v0.104.22-x86_64-unknown-linux-musl.tar.gz")
    Error: 
       0: Failed to download release after 3 tries.
    
    Location:
       /home/runner/work/safe_network/safe_network/sn_node_manager/src/helpers.rs:46
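
As an illustration of the last point, a tool could accept either form and reduce it to the bare version before building the download URL. The sketch below is an assumption about how that might look; `normalize_version` is a hypothetical name, and the "sn_node-v" prefix is simply the tag format quoted above.

```rust
/// Reduce a release tag such as "sn_node-v0.104.22" to the bare version "0.104.22".
/// Hypothetical helper: the "sn_node-v" prefix is taken from the tag quoted in the thread.
fn normalize_version(input: &str) -> Result<String, String> {
    // Strip the tag prefix, or a plain leading 'v', if either is present.
    let bare = input
        .strip_prefix("sn_node-v")
        .or_else(|| input.strip_prefix('v'))
        .unwrap_or(input);
    let parts: Vec<&str> = bare.split('.').collect();
    let all_numeric = parts
        .iter()
        .all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()));
    if parts.len() == 3 && all_numeric {
        Ok(bare.to_string())
    } else {
        Err(format!("'{input}' is not a MAJOR.MINOR.PATCH version"))
    }
}
```

Under this sketch both "sn_node-v0.104.22" and "0.104.22" resolve to the same version, and anything that is not three numeric components is rejected up front instead of failing after three download attempts.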
    

After overcoming some of the obstacles above, and dropping the custom RPC port, the metrics port, and the custom local build path, the manager did register safenode1 under the Alpine LXC:

=================================================
              Add Safenode Services              
=================================================
1 service(s) to be added
The X user already exists
Downloading safenode version 0.104.22...
Download completed
Services Added:
 ✓ safenode1
    - Safenode path: /../safe_node_vaults/safe-node-145/safenode1/safenode
    - Data path: /.../safe_node_vaults/safe-node-145/safenode1
    - Log path: /.../safe_node_logs/safe-node-145/safenode1
    - RPC port: 192.168.X.X:Y
[!] Note: newly added services have not been started

Note: safe-node-145 is just the naming format for the hostname of my LXC at home (spun up via terraform scripts). In my case the 145 is a unique number per container, whether active or stopped.

safe-node-145:/.../safe_binaries# ./safenode-manager status --details
=================================================
                Safenode Services                
=================================================
==========================
safenode1 - ADDED
==========================
Version: 0.104.22
Peer ID: -
RPC Socket: 192.168.X.X:Y
Listen Addresses: None
PID: -
Data path: /.../safe_node_vaults/safe-node-145/safenode1
Log path: /.../safe_node_logs/safe-node-145/safenode1
Bin path: /.../safe_node_vaults/safe-node-145/safenode1/safenode
Connected peers: -

I tried starting the service, but for now I keep getting this Exec format error:

safe-node-145:/.../safe_binaries# ./safenode-manager start
=================================================
             Start Safenode Services             
=================================================
Attempting to start safenode1...
Failed to start 1 service(s):
✕ safenode1:  * rc-service: Exec format error
Error: 
   0: Failed to start one or more services
Location:
   sn_node_manager/src/bin/cli/main.rs:700

The safenode1 service is listed under rc-service:

safe-node-145:/.../safe_binaries# rc-service --list | grep -i safenode
safenode1

The safenode binary downloaded from GitHub seems to run and report its version properly:

safe-node-145:/.../safe_binaries# /.../safe_node_vaults/safe-node-145/safenode1/safenode --version
safenode cli 0.104.22

The underlying service itself looks okay to me:

[Unit]
Description=safenode1
[Service]
ExecStart=/.../safe_node_vaults/safe-node-145/safenode1/safenode --rpc 192.168.X.X:Y --root-dir /.../safe_node_vaults/safe-node-145/safenode1 --log-output-dest /.../safe_node_logs/safe-node-145/safenode1 --port X --peer /ip4/161.35.161.68/udp/50025/quic-v1/p2p/12D3KooWFNFMMJqUnvYXPuw15BjMtMb7PEeSgxzaCFTtkCtvxUpo
Restart=on-failure
User=X
KillMode=process
[Install]
WantedBy=multi-user.target

Running the ‘ExecStart’ command from the shell as is seems to work properly. Still investigating!

Overall, I haven’t made up my mind whether I will run more than 1 safenode process per LXC, but if I want to in the future, I know I have that option. Thanks!

Thanks for the feedback. I will read this through in detail tomorrow and get back to you.

Update:

I overlooked this, but since this is Alpine, the file at /etc/init.d/safenode1 needs to be in OpenRC format and it is not, hence, I think, the incompatibility and the Exec format error. I would have to switch distro to make use of safenode-manager, or write my own OpenRC init.d scripts which safenode-manager can kick-start via the rc-service command. At that point, at least for myself, it may not make sense to use safenode-manager for most of the setup work if the systemd format it generates is going to conflict with the OpenRC init system underneath.
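
A quick way to check which format a generated service file is in would be to inspect its contents. The following is only a rough sketch: `ServiceFormat` and `detect_service_format` are hypothetical names, and the heuristics (an openrc-run shebang versus `[Unit]` or `[Service]` section headers) are assumptions based on the two definitions shown in this thread.

```rust
#[derive(Debug, PartialEq)]
enum ServiceFormat {
    OpenRc,
    Systemd,
    Unknown,
}

/// Guess the format of a service definition from its text.
/// Heuristic only: OpenRC scripts start with an openrc-run shebang,
/// systemd units contain [Unit] or [Service] section headers.
fn detect_service_format(contents: &str) -> ServiceFormat {
    let first_line = contents.lines().next().unwrap_or("").trim();
    if first_line.starts_with("#!") && first_line.ends_with("openrc-run") {
        ServiceFormat::OpenRc
    } else if contents
        .lines()
        .any(|line| line.trim() == "[Unit]" || line.trim() == "[Service]")
    {
        ServiceFormat::Systemd
    } else {
        ServiceFormat::Unknown
    }
}
```

A file under /etc/init.d that comes back as `Systemd` has no shebang for the kernel to act on, which would be consistent with rc-service reporting an Exec format error.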

A quick trial run of changing the format seems to have worked, for those on a non-standard or uncommon distro:

#!/sbin/openrc-run
  
name=$RC_SVCNAME
command="/.../safe_node_vaults/safe-node-145/safenode1/safenode"
command_args="--rpc 192.168.X.X:Y --root-dir /.../safe_node_vaults/safe-node-145/safenode1 --log-output-dest /.../safe_node_logs/safe-node-145/safenode1 --port X --peer /ip4/161.35.161.68/udp/50025/quic-v1/p2p/12D3KooWFNFMMJqUnvYXPuw15BjMtMb7PEeSgxzaCFTtkCtvxUpo"
command_user="X"
pidfile="/var/run/$RC_SVCNAME.pid"
start_stop_daemon_args=""
command_background="yes"

depend() {
        need net
}

safe-node-145:/.../safe_binaries# ./safenode-manager start
=================================================
             Start Safenode Services             
=================================================
Attempting to start safenode1...
✓ Started safenode1 service
  - Peer ID: XXXX
  - Logs: /.../safe_node_logs/safe-node-145/safenode1
safe-node-145:/.../safe_binaries# ./safenode-manager status
=================================================
                Safenode Services                
=================================================
Service Name       Peer ID                                              Status  Connected Peers
safenode1          XXXX RUNNING              13
safe-node-145:/.../safe_binaries# ./safenode-manager status --details
=================================================
                Safenode Services                
=================================================
============================
safenode1 - RUNNING
============================
Version: 0.104.22
Peer ID: XXXX
RPC Socket: 192.168.X.X:Y
Listen Addresses: Some(["/ip4/127.0.0.1/udp/X/quic-v1/p2p/XXXX", "/ip4/192.168.X.Y/udp/X/quic-v1/p2p/XXXX"])
PID: 8482
Data path: /.../safe_node_vaults/safe-node-145/safenode1
Log path: /.../safe_node_logs/safe-node-145/safenode1
Bin path: /.../safe_node_vaults/safe-node-145/safenode1/safenode
Connected peers: 5

Obviously this is not a standard container image that is supported out of the gate, so no worries about this part of the concerns I raised above :smiley:.

Thanks for all the info! Sorry, I promise I will get back to you on this tomorrow. I’m actually not working today.

I’ve used the node manager on Alpine on my Pi, so there is definitely some support for OpenRC. We use an external crate to generate the service definitions, and OpenRC doesn’t have as much support as Systemd, but I’ve submitted several PRs, so I’m sure we will be able to do something.

I appreciate all your feedback. It’s really helpful.

OK, if people want to use custom ports for this, we’ll need to change the type of the argument from Ipv4Addr to SocketAddr. We should be able to do this.
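
The difference between the two types can be seen with the standard library alone. In this sketch, `parse_ip_only`, `parse_ip_and_port` and `rpc_port` are hypothetical helpers, and the base, node and offset values used with them are made up for illustration; only `Ipv4Addr` and `SocketAddr` come from `std::net`.

```rust
use std::net::{Ipv4Addr, SocketAddr};

/// Parse an address given as IP only, which is all an `Ipv4Addr` accepts.
fn parse_ip_only(input: &str) -> Result<Ipv4Addr, String> {
    input.parse::<Ipv4Addr>().map_err(|e| e.to_string())
}

/// Parse an address given as IP:PORT, which is what a `SocketAddr` expects.
fn parse_ip_and_port(input: &str) -> Result<SocketAddr, String> {
    input.parse::<SocketAddr>().map_err(|e| e.to_string())
}

/// Deterministic scheme described earlier in the thread:
/// base + node number + common RPC offset. Returns None on u16 overflow.
fn rpc_port(base: u16, node: u16, rpc_offset: u16) -> Option<u16> {
    base.checked_add(node)?.checked_add(rpc_offset)
}
```

Parsing a value of the form IP:PORT as an `Ipv4Addr` fails, which is the kind of error reported above for '192.168.X.X:Y', whereas the same string parses cleanly as a `SocketAddr`. A caller could then compute the port with something like `rpc_port(30000, 3, 1000)` and pass the resulting IP:PORT pair on the command line.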

Yeah thanks, I already came across this issue on my Pi. I’ve got a task to fix it.

We can add an argument for passing the metrics port. I’m not sure if that would be controlled using a feature on the node manager itself. I think I may just add the argument and use documentation to specify that the node binary would require the metrics feature.

Hmm, there is a --path argument for other commands. Not sure how I managed to miss that for add. I will provide --path as an argument there too.
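
For anyone wondering how a tool might tell the two kinds of input apart, here is a minimal sketch. `BinarySource` and `classify_source` are hypothetical names, and treating anything without an http:// or https:// prefix as a local path is an assumption, not how the manager is known to behave.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
enum BinarySource {
    Url(String),
    LocalPath(PathBuf),
}

/// Classify a user-supplied source for a custom safenode build.
/// Assumption: only http:// and https:// count as URLs; everything
/// else is treated as a relative or absolute file system path.
fn classify_source(input: &str) -> BinarySource {
    if input.starts_with("http://") || input.starts_with("https://") {
        BinarySource::Url(input.to_string())
    } else {
        BinarySource::LocalPath(PathBuf::from(input))
    }
}
```

Handing a bare file system path to an HTTP client is what produces an error like RelativeUrlWithoutBase, so a separate --path argument, or a check along these lines, avoids the download attempt entirely.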

Yeah thanks, I’ll get the documentation updated to that effect.

I have a Pi running Alpine:

pi4:~$ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.19.1
PRETTY_NAME="Alpine Linux v3.19"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

I’ve used the node manager on it, and it generates the service definition in OpenRC format:

pi4:~$ cat /etc/init.d/safenode1
#!/sbin/openrc-run

description="safenode1"
command="/var/safenode-manager/services/safenode1/safenode"
command_args="--port 12000 --rpc 127.0.0.1:54869 --root-dir /var/safenode-manager/services/safenode1 --log-output-dest /var/log/safenode/safenode1 --peer /ip4/161.35.32.56/udp/45513/quic-v1/p2p/12D3KooWNNZBnkjp9DhFvMY8w6x1QnbUC3LDdfqvkqczYzukHrhd"
pidfile="/run/${RC_SVCNAME}.pid"
command_background=true

depend() {
    provide safenode1
}

However, I just tried it on a quick Alpine VM via vagrant, which is version 3.18, and it generated a Systemd service definition. I’m not sure if it’s just the Alpine version, or if something else is at play, but at least I can reproduce it.