The node needs to be started with this added: --rpc 127.0.0.1:<some_port>
The call will then become: grpcurl -plaintext -proto safenode.proto 127.0.0.1:<some_port> safenode_proto.SafeNode/KBuckets
You can use: NodeInfo, NetworkInfo, NodeEvents, RecordAddresses, KBuckets, Stop, Restart, Update, UpdateLogLevel, as described in the safenode.proto file.
Done! Plus more: snnm v0.2.0, a.k.a. the "neo edition" in your honor, is in my repo now.
As usual it is experimental, use-at-your-own-risk, and takes some effort to get set up, but it works great for me.
It's able to:
-e: flag to enable rpc on nodes when starting, restarting, or exchanging node processes
-L: the rpc-enabled but slower cousin of -l, to list all nodes incl. actual peer count and uptime
The rpc port number is derived from the node port: node_port - NODE_BASE_PORT + RPC_BASE_PORT.
sample output: | port 5xxxx | pid 271192 | peer-id 12D3xxxx | v."0.108.2", 29 peers, uptime "43483" s
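The port derivation above can be sketched in bash as follows. The two base values here are made-up placeholders for illustration, not snnm's actual defaults, and rpc_port_for is a hypothetical helper name:

```shell
# Hypothetical base values; check the snnm script for the real ones.
NODE_BASE_PORT=50000
RPC_BASE_PORT=30000

# Derive the rpc port from a node port: node_port - NODE_BASE_PORT + RPC_BASE_PORT
rpc_port_for() {
  local node_port=$1
  echo $(( node_port - NODE_BASE_PORT + RPC_BASE_PORT ))
}

rpc_port_for 50007   # prints 30007 with the placeholder bases above
```

With a fixed offset like this, each node port maps to exactly one rpc port, so the rpc port never has to be stored separately.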
Installation of grpcurl and friends (for installation into your $HOME directory):
Nah, I'm convinced something is backwards here. How am I supposed to know an incoming port from another user when everyone is setting their own range? I only know what I'm setting, not what incoming connections will be. What am I missing, lol?
snnm version 0.2.9 has this kind of output. All you need to know: connected peers in real time, number of records, and forwarded earnings. Plus totals for each server.
snnm -a 53344 -b 53349 -L
..................................................
| port 53344 | pid 312230 | peer-id 12D3KooW..1 | v."0.108.2" | 33 peers | 99 records | "18728"s uptime |
| port 53345 | pid 315238 | peer-id 12D3KooW..2 | v."0.108.2" | 258 peers | 100 records | "18589"s uptime |
| port 53346 | pid 299563 | peer-id 12D3KooW..3 | v."0.108.2" | 247 peers | 131 records | "19282"s uptime | fwd_nanos: 20
| port 53347 | pid 317531 | peer-id 12D3KooW..4 | v."0.108.2" | 28 peers | 133 records | "18477"s uptime | fwd_nanos: 10
| port 53348 | pid 307574 | peer-id 12D3KooW..5 | v."0.108.2" | 58 peers | 106 records | "18891"s uptime | fwd_nanos: 10
| port 53349 | pid 301291 | peer-id 12D3KooW..6 | v."0.108.2" | 83 peers | 116 records | "19200"s uptime | fwd_nanos: 10
Total connected peers (for nodes with rpc port only): 707
Total records (for nodes with rpc port only): 685
Total forwarded nanos: 50
snnm execution is done. In some cases you need to [Ctrl]+[c] to get back to your command prompt.
Ok, I think I'm well confused. So I've reset everything with safenode-manager reset, then I recreated my node list with safenode-manager add --owner .user --home-network --count 20 --node-port 21000-21019 --rpc-port 31000-31019 --peer /ip4/46.101.80.187/udp/58070/quic-v1/p2p/12D3KooWKgJQedzCxrp33u3dBD1mUZ9HTjEjgrxskEBvzoQWkRT9 - FYI I've tried this with and without the --home-network, which should not be needed.
My router is set up exactly the same way to access other services… unsure of the best way to test each layer, but now I'm getting 0 connections on anything at any point.
The fixes are supposed to remove the need for the peer option, and that peer probably doesn't exist anymore anyhow.
And I hope that .user is changed to your discordID and not the user option.
home-network doesn't need the ports specified, as they really are not listening anyhow. The connection is done via normal communications, where the node communicates with the relay first, opening the connection pathway.
Only need the ports when not specifying home-network and port forwarding
The peer option is prob the biggest problem here, with the update those old peers will be gone
Good lord split me. I had an inclination it was the peer not existing, lol. Yeah, I figured the other stuff was the case. And yes, that is the case. Thanks very much, I'll wait for this update then. Excellent.
snnm is now v0.3.2. Versions 0.3.x use a new approach to significantly increase speed and to use fewer resources. No more lsof and netstat!
edit: Jumped once more, to v0.4.x now, with metrics server support added…
Link to my development branch for the latest version: snnm
To install it on Rocky Linux 9.4 (not tested on other distributions yet):
cd "$HOME"
rm -f snnm
wget https://raw.githubusercontent.com/drirmbda/node-toolbox/drirmbda-dev/snnm -O snnm
echo "WARNING: snnm is experimental software, only for people who know what they are doing. Review the code before using at your own responsibility."
chmod +x snnm
wget https://github.com/fullstorydev/grpcurl/releases/download/v1.9.1/grpcurl_1.9.1_linux_386.rpm
sudo rpm -i grpcurl_1.9.1_linux_386.rpm
sudo dnf -y install grpcurl jq
wget https://raw.githubusercontent.com/maidsafe/safe_network/main/sn_protocol/src/safenode_proto/safenode.proto -O safenode.proto
wget https://raw.githubusercontent.com/maidsafe/safe_network/main/sn_protocol/src/safenode_proto/req_resp_types.proto -O req_resp_types.proto
Note: Sometimes, when I commit changes, I forget to set the default owner value on line 6 back to OWNER_DISCORD_ID="none", which runs nodes without setting any Discord ID. You can also change it to your own ID to avoid needing to specify it every time using -d.
Adding --metrics-server-port <portnumber> at node start enables the metrics server. You can then grab a snapshot using wget for example and grep something interesting.
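A minimal sketch of grabbing one value from such a snapshot. It assumes the metrics server returns Prometheus-style plain text ("name value" per line) and that the snapshot is served under /metrics; both are assumptions to verify against your node. metric_value and the metric name example_connected_peers are hypothetical:

```shell
# Print the value of one metric from Prometheus-style text read on stdin.
# Comment lines (# HELP / # TYPE) never match because their first field is "#".
metric_value() {
  local name=$1
  awk -v n="$name" '$1 == n { print $2 }'
}

# Usage against a running node (port, path and metric name are assumptions):
# wget -qO- "http://127.0.0.1:<portnumber>/metrics" | metric_value example_connected_peers
```

Matching on the exact first field, rather than a plain grep, avoids picking up other metrics that merely share a prefix with the name you are after.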
We have logs, rpc, and a metrics server. Now, what would be a good script to assess the node health based on a combination of factors, summarized into an "OK" stamp of approval?
Cap the number of running nodes while you are starting nodes.
Obviously not so good when you are using safenode-manager or launchpad, but works great with snnm. (As with anything using the kill command, test and review if this will work for you.)
Rocky 9.4, bash
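TARGET=50 # allowed number of nodes running at any time
while true
do
  # If more than TARGET safenode processes are running, kill the first two listed by ps
  if [ "$(ps -A | grep -c safenode)" -gt "$TARGET" ]; then
    kill $(ps -A | grep safenode | head -2 | awk '{print $1}')
  fi
  # Print the number of safenode processes, then how many of those are defunct
  ps -A | grep -c safenode ; ps -A | grep safenode | grep -c def
  sleep 2
done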
Get list of active IP addresses from which a node is receiving data by safenode PID.
I found a way to get active connections or connection counts without using costly RPC calls. To get accurate values, a port should be monitored for 30 seconds, but 0.5 s is good enough for a snapshot, in which case the reported values are a lower limit. This way we can assess whether the node is connected.
This is integrated in the snnm tool (v0.4.9+), which is frankly getting quite ahead of the capabilities of official tools and is also more transparent, easier to debug, to tweak, and it scales really well. But of course snnm is limited to RHEL OSes, or at least remains untested on other distributions.
Identify nodes without good peers to contact, thus needing a restart.
This approach uses the log output only and probably is more reliable than counting connections.
Rocky Linux 9.4, bash
# The following is run from the logs directory of a node
grep -e "Skip bad_nodes check" -e "Performing" safenode.log
# If the last item contains "Skip", then the number of nodes "in the RT" is "too small" and we should do a new bootstrap. A bootstrap shows up in the log as "Performing" a bootstrap.
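The "look at the last matching item" rule can be wrapped in a small function. This is only a sketch of the log check described here, not the code snnm uses; needs_bootstrap is a hypothetical name, and the two search strings are the ones quoted above:

```shell
# Succeeds (exit status 0) when the most recent matching log line is a
# "Skip bad_nodes check", i.e. no bootstrap has been logged since then.
needs_bootstrap() {
  local log=${1:-safenode.log}
  local last
  last=$(grep -e "Skip bad_nodes check" -e "Performing" "$log" | tail -n 1)
  case $last in
    *"Skip bad_nodes check"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage, from the logs directory of a node:
# needs_bootstrap safenode.log && echo "node may need a restart"
```

A log with no matching lines at all is treated as "no restart needed", which may or may not be what you want for a freshly started node.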
This is incorporated in snnm v0.5.0+, which also can launch vdash by specifying a port number instead of a path and many more improvements.
snnm v0.5.3 includes an improved health check, and it prints a reassuring "seems okay" for nodes that appear to be happy. This still needs improvement…
Instead of using systemd and monitoring, I run this periodically: