You will need to install the safe client for that. Have a look at the instructions in the first post. If you are still struggling, let us know.
If you’re using the same PowerShell terminal as when you installed everything (didn’t close and re-open it), the command path isn’t loaded yet. You’ll need to either close and reopen the terminal, or just open a new one if there’s output on it you still want to see.
I am trying to understand the networking IP <=> Port combination that I am seeing associated with my WAN IPs in the logs (something seems bizarre):
1 LXC - 50 safenode pids:
The ports provided are 12000 to 12049 for these 50 safenode pids. These are also NAT port forwarded on the router. The logs show the safenode pids listening on, say, the 1st port (12000), and its external address path is also valid:
[2024-03-28T19:24:45.053090Z INFO sn_networking::event] Local node is listening on "/ip4/192.168.X.Y/udp/12000/quic-v1/p2p/12D3KooWSXEK3AAGjfczYataxPNYku7J6z9BpfzEjj7JEyEctakS"
external address: new candidate address=/ip4/A.B.C.D/udp/12000/quic-v1 <- valid public IP here as well.
Yet I see all these transport errors on ‘ports’ that are not even configured as a listener’s port on safenode for 1 or more of the other 199 safenode pids that this 1st safenode pid is attempting to talk with:
[2024-03-28T19:42:45.537740Z WARN sn_networking::event] OutgoingConnectionError to PeerId("12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF") on ConnectionId(4663) - Transport([("/ip4/A.B.C.D/udp/57447/quic-v1/p2p/12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/A.B.C.D/udp/12006/quic-v1/p2p/12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }))])
[2024-03-28T19:46:20.428522Z WARN sn_networking::event] OutgoingConnectionError to PeerId("12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF") on ConnectionId(5117) - Transport([("/ip4/A.B.C.D/udp/12006/quic-v1/p2p/12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/A.B.C.D/udp/57447/quic-v1/p2p/12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }))])
For the 200 nodes, the port range was 12000 to 12199 (configured as input to safenode pid, and on the router).
Peer ID - 12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF
was specifically bootstrapped to port 12006 (safenode7 service):
[2024-03-28T19:25:11.731232Z INFO sn_networking::event] Local node is listening on "/ip4/192.168.X.Y/udp/12006/quic-v1/p2p/12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF"
[2024-03-28T19:25:12.014872Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/12006/quic-v1 <- valid Public IP
Where is it auto-negotiating or deciding to use port 57447 from, for the Peer ID above for the safenode7 service (peer ID ending in JyiF)? (Or, for that matter, any port outside of 12000 to 12199 against my own WAN IPs.)
Clearly, port 57447 isn’t open on the WAN IP here… and maybe that’s why it’s just failing with transport errors (across the board).
Two connection IDs generated from the safenode1 service pid trying to talk with the safenode7 service pid (both running under the same container):
ConnectionId 4663 - UDP - 57447 - 12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF
ConnectionId 5117 - UDP - 12006 - 12D3KooW9zmzaJCx4JdyYCyWtHXET8trWgUhv6cp8sW1UfhoJyiF
Am I missing something obvious here? Any feedback would be appreciated.
Still poking at the logs and internal configurations.
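For anyone wanting to do the same check on their own logs, here is a minimal Python sketch that pulls every UDP port out of the OutgoingConnectionError lines and flags the ones outside the configured range. The function name, the regexes and the 12000 to 12199 range are my own assumptions based on the log lines quoted above; it is not part of safenode or safenode-manager.

```python
import re

# Matches the UDP port and peer ID of each multiaddr in a dial attempt.
MULTIADDR_RE = re.compile(r'/ip4/[^/]+/udp/(\d+)/quic-v1/p2p/(\w+)')
CONN_ID_RE = re.compile(r'ConnectionId\((\d+)\)')


def ports_per_connection(log_lines, allowed_ports):
    """Return {connection_id: (ports_tried, ports_outside_allowed)}."""
    result = {}
    for line in log_lines:
        if "OutgoingConnectionError" not in line:
            continue
        conn = CONN_ID_RE.search(line)
        if conn is None:
            continue
        # De-duplicate and sort the ports listed for this connection attempt.
        ports = sorted({int(port) for port, _peer in MULTIADDR_RE.findall(line)})
        unexpected = [p for p in ports if p not in allowed_ports]
        result[int(conn.group(1))] = (ports, unexpected)
    return result


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_ports.py < safenode.log
    configured = range(12000, 12200)  # assumed range, adjust to your own setup
    for conn_id, (ports, bad) in ports_per_connection(sys.stdin, configured).items():
        print(conn_id, ports, "unexpected:", bad)
```

Any port in the "unexpected" list is one that was dialled for the peer but never configured as a listener port.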
Just a very quick check. Those other port numbers are definitely not the RPC ports?
Absolutely, never configured them as inputs, not on the router nor fed as --port parameters to the safenode pid via safenode-manager.
I double checked ps -ef | grep safenode, and the --ports are all 12000 to 12049 for the 1st LXC for its 50 safenodes pids.
The ports for RPC and Metrics are in the 13,000+ and 14,000+ range for me. 57447 seems like a total anomaly (among other random ports I am seeing).
Not sure what is going on. Sorry.
was just trying an experiment.
I have 10 vps on the current network s00 - s10
s02 was just started with 40 nodes (short time out of 61s), and all my instances have now jumped from 10% CPU to the levels shown in the picture below.
Is this some kind of proof of my long-standing theory that too many nodes joining can crash a testnet?
ping @joshuef ??
-------------- edit
this is how they are looking most of the time
A big download took our wifi down again. My node went offline in the same event, but after rebooting the router, it came back to life.
just tried setting up first client and node, per instructions above, it choked on this… Installing safenode-manager version 0.7.2…
Error: Release binary https://sn-node-manager.s3.eu-west-2.amazonaws.com/safenode-manager-0.7.2-x86_64-pc-windows-msvc.tar.gz was not found
node and client installed ok
Yeah sorry, some people pointed that out earlier. There was a problem with the release process which meant the binaries for the node manager didn’t go up correctly.
No problem Chris, anyone else seeing a lot of transport or handshake timeouts?
I do see it showing multiple ports for a single Peer ID Url (how bizarre):
i.e. 3981, 50765, 8021, 12008, 64204, 14265, 37799
12008 is the one I configured as the starting input, yet it thinks the others are also associated with this peer id:
[2024-03-28T21:08:08.024095Z WARN sn_networking::event] OutgoingConnectionError to PeerId("12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs") on ConnectionId(20489) - Transport([
("/ip4/A.B.C.D/udp/3981/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })),
("/ip4/A.B.C.D/udp/50765/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })),
("/ip4/A.B.C.D/udp/8021/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })),
("/ip4/A.B.C.D/udp/12008/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })),
("/ip4/A.B.C.D/udp/64204/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })),
("/ip4/A.B.C.D/udp/14265/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })),
("/ip4/A.B.C.D/udp/37799/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }))
])
The logs show the detected external candidate address changing rapidly during the lifetime of a single pid (oh man… aaaah!! I will need to dig into the reason why…). This is definitely not expected:
[2024-03-28T19:25:22.270606Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/12008/quic-v1
[2024-03-28T19:25:23.483660Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/12008/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs
[2024-03-28T19:25:23.717794Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/3981/quic-v1
[2024-03-28T19:25:24.490792Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/8021/quic-v1
[2024-03-28T19:25:24.518781Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/50765/quic-v1
[2024-03-28T19:26:47.904880Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/37799/quic-v1
[2024-03-28T19:27:45.331681Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/14265/quic-v1
[2024-03-28T19:27:50.334862Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/64204/quic-v1
[2024-03-28T19:33:16.528309Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/14265/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs
[2024-03-28T19:37:19.949118Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/3981/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs
[2024-03-28T20:08:04.169718Z INFO sn_networking::event] external address: new candidate address=/ip4/A.B.C.D/udp/64204/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs
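To see how quickly the candidate ports pile up, a small Python sketch like the one below can list each advertised port with the timestamp at which it first appeared. The regex is written against the exact log format quoted above and the function name is hypothetical; treat it as a rough helper, not an official tool.

```python
import re

# Matches lines such as:
# [<timestamp> INFO sn_networking::event] external address: new candidate address=/ip4/<ip>/udp/<port>/quic-v1
CANDIDATE_RE = re.compile(
    r'^\[(?P<ts>[^ ]+) INFO [^\]]+\] external address: new candidate address='
    r'/ip4/(?P<ip>[^/]+)/udp/(?P<port>\d+)/quic-v1'
)


def candidate_ports(log_lines):
    """Map each advertised UDP port to the timestamp at which it first appeared."""
    first_seen = {}
    for line in log_lines:
        match = CANDIDATE_RE.match(line)
        if match:
            # setdefault keeps the earliest timestamp when a port shows up again.
            first_seen.setdefault(int(match.group("port")), match.group("ts"))
    return first_seen


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python candidates.py < safenode.log
    for port, ts in candidate_ports(sys.stdin).items():
        print(ts, port)
```

Run against the snippet above, this should report seven distinct ports for one pid, only one of which (12008) was actually configured.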
I am definitely causing more chaos on the network and handshake timeouts for others (hopefully the bad peer detection and blocking system is kicking in nicely on other folks’ nodes against my ‘bad nodes’).
I have the same feeling that sometimes we, like @Southside, are the problem and not the solution.
Yes, I have the following marks in my logs:
Cleaning out peer PeerId("12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs")
The whole snippet about it happening:
[2024-03-28T20:39:02.510078Z WARN sn_networking::event] OutgoingConnectionError to PeerId("12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs") on ConnectionId(60261) - Transport([("/ip4/99.43.124.25/udp/50765/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/99.43.124.25/udp/8021/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/99.43.124.25/udp/14265/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/99.43.124.25/udp/64204/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/99.43.124.25/udp/37799/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/99.43.124.25/udp/3981/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })), ("/ip4/99.43.124.25/udp/12008/quic-v1/p2p/12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }))])
[2024-03-28T20:39:02.510604Z ERROR sn_networking::event] Dial errors len : 7
[2024-03-28T20:39:02.510619Z ERROR sn_networking::event] OutgoingTransport error : Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })
[2024-03-28T20:39:02.510655Z WARN sn_networking::event] Problematic error encountered: Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }
[2024-03-28T20:39:02.510674Z ERROR sn_networking::event] OutgoingTransport error : Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })
[2024-03-28T20:39:02.510699Z WARN sn_networking::event] Problematic error encountered: Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }
[2024-03-28T20:39:02.510714Z ERROR sn_networking::event] OutgoingTransport error : Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })
[2024-03-28T20:39:02.510740Z WARN sn_networking::event] Problematic error encountered: Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }
[2024-03-28T20:39:02.510753Z ERROR sn_networking::event] OutgoingTransport error : Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })
[2024-03-28T20:39:02.510775Z WARN sn_networking::event] Problematic error encountered: Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }
[2024-03-28T20:39:02.510790Z ERROR sn_networking::event] OutgoingTransport error : Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })
[2024-03-28T20:39:02.510817Z WARN sn_networking::event] Problematic error encountered: Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }
[2024-03-28T20:39:02.510829Z ERROR sn_networking::event] OutgoingTransport error : Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })
[2024-03-28T20:39:02.510854Z WARN sn_networking::event] Problematic error encountered: Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }
[2024-03-28T20:39:02.510867Z ERROR sn_networking::event] OutgoingTransport error : Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } })
[2024-03-28T20:39:02.510890Z WARN sn_networking::event] Problematic error encountered: Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }
[2024-03-28T20:39:02.510908Z WARN sn_networking::event] Cleaning out peer PeerId("12D3KooWPMpYUsdVS5txq4jKyTVkMgEvditEMdidjJgrdcWG48Gs")
I am glad I am like the only node operator getting extensively banned by the network for misbehaving lol.
Glad to have helped test out the bad node detection feature, hahah.
Now… to go about actually fixing this issue on my end so I can stop being marked as a bad peer.
But it is interesting that your bad node was just “Cleaned out…” and never “Removed from routing table”.
Because one previous bad node got one more mark after the “Cleaning…”, namely this:
PeerRemovedFromRoutingTable(PeerId("12D3KooWCZCrb3CqsBhPFw8FPv7RZwzpvFXgX6G4PPb3LTMr9FFj"))
My safe node pids seem to be rotating <constant IP><different Port><constant Peer ID>
as far as the ‘external’ address message is being relayed in the p2p communication…
This turns into 100s of combinations to mark bad, and not to mention a possible amplification here… from 100s of peer id (safe node pids) that are already running causing slowdown for the network (maybe)?
Hey @Shu! Great digging there, I think I know where that might be coming from. Currently, when a node makes a connection with us, it will also tell us what it thinks is our listen address. And we are blindly considering those listen addresses to be ours, and adding them to our list.
Maybe we should not be adding those listen addresses. I’m not sure how another node might see us listening at a different port, but this might explain the strange logs that you have there. I will check what’s going on there.
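As a rough illustration of the idea of not blindly trusting peer-reported addresses, here is a minimal Python sketch that only accepts a reported address when its UDP port is one the node actually listens on. This is purely hypothetical: the function names are made up, it is not how sn_networking or libp2p handle this, and it assumes ports are forwarded 1:1 on the router (as in the setup described above), which would not hold behind a NAT that remaps ports.

```python
def udp_port(multiaddr):
    """Extract the UDP port from a multiaddr string such as /ip4/1.2.3.4/udp/12008/quic-v1."""
    parts = multiaddr.strip("/").split("/")
    # The port is the component immediately after "udp".
    return int(parts[parts.index("udp") + 1])


def split_candidates(reported_addrs, listen_ports):
    """Split peer-reported addresses into (accepted, rejected) by listen port."""
    accepted, rejected = [], []
    for addr in reported_addrs:
        if udp_port(addr) in listen_ports:
            accepted.append(addr)
        else:
            rejected.append(addr)
    return accepted, rejected


if __name__ == "__main__":
    reported = [
        "/ip4/A.B.C.D/udp/12008/quic-v1",
        "/ip4/A.B.C.D/udp/3981/quic-v1",
        "/ip4/A.B.C.D/udp/50765/quic-v1",
    ]
    accepted, rejected = split_candidates(reported, listen_ports={12008})
    print("accepted:", accepted)
    print("rejected:", rejected)
```

With a filter along these lines, only the 12008 address would end up in the list that gets shared with other peers.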
As far as I can tell, no new UDP ports outside 12000 to 12199 are being opened on my end (from the LXC itself), but this external address for libp2p associated with my WAN IP is definitely in flux as per the logs from safenode p2p communication.
I have to double check whether the LAN <=> WAN mapping is holding static… though on the first spin-up it does identify the external address and the port properly (the input port I gave to the pid).
I didn’t fully follow your statement above… but yeah. Maybe you are alluding to the URLs associated with my 1 single safenode pid constantly being added (as I see in the messages, they are all being added and bucketed as part of some array, which keeps growing as the external address somehow keeps changing).
Aha! I had that very same issue yesterday morning. @roland fired in a fix and it’s out in 0.90.2. (I’ll update the OP to reflect that.)
Did you have success before you had errors? It’s sounding like wallet got out of sync…
At the moment, that’s not in there, I think, @josh. It should be coming back soon. We’ve a PR waiting in the wings that reuses this and should get cost reflecting price more.
So, I think I was off (if I said that recently), as it has been simplified to be just record count. But that’s not as representative of the actual close files your nodes are responsible for, so we’ll be looking to bring that back.