Wanted to note here that I am seeing non-public IPs being conveyed to remote peers. This is triggering MultiaddrNotSupported messages. Is that safe to ignore, or possibly a concern?
safe-node-155-0:# cat safenode.log | grep 12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b
[2024-04-02T03:11:22.238562Z INFO sn_networking::event] received identify info from undialed peer for not full kbucket Some(254), dail back to confirm external accesable peer_id=12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b addrs={"/ip4/65.108.236.166/udp/10425/quic-v1"}
[2024-04-02T03:11:22.464108Z INFO sn_networking::event] New peer added to routing table: PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b"), now we have #191 connected peers
[2024-04-02T03:11:22.464128Z INFO sn_networking::event] Peer PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b") has a Some(254) distance to us
[2024-04-02T03:11:22.464158Z INFO sn_networking::event] kad_event::RoutingUpdated 191: PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b"), is_new_peer: true old_peer: None
[2024-04-02T03:11:22.464272Z INFO sn_node::log_markers] PeerAddedToRoutingTable(PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b"))
[2024-04-02T03:16:28.808829Z WARN sn_networking::event] OutgoingConnectionError to PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b") on ConnectionId(1078) - Transport([("/ip4/127.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b", MultiaddrNotSupported("/ip4/127.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b")), ("/ip4/10.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b", MultiaddrNotSupported("/ip4/10.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b")), ("/ip4/192.168.9.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b", MultiaddrNotSupported("/ip4/192.168.9.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b")), ("/ip4/65.108.236.166/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b", Other(Custom { kind: Other, error: Custom { kind: Other, error: HandshakeTimedOut } }))])
[2024-04-02T03:16:28.808885Z ERROR sn_networking::event] OutgoingTransport error : MultiaddrNotSupported("/ip4/127.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b")
[2024-04-02T03:16:28.808892Z WARN sn_networking::event] Multiaddr not supported : "/ip4/127.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b"
[2024-04-02T03:16:28.808900Z ERROR sn_networking::event] OutgoingTransport error : MultiaddrNotSupported("/ip4/10.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b")
[2024-04-02T03:16:28.808907Z WARN sn_networking::event] Multiaddr not supported : "/ip4/10.0.0.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b"
[2024-04-02T03:16:28.808913Z ERROR sn_networking::event] OutgoingTransport error : MultiaddrNotSupported("/ip4/192.168.9.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b")
[2024-04-02T03:16:28.808920Z WARN sn_networking::event] Multiaddr not supported : "/ip4/192.168.9.1/udp/10425/quic-v1/p2p/12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b"
[2024-04-02T03:16:28.808932Z WARN sn_networking::event] Cleaning out peer PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b")
[2024-04-02T03:16:28.808948Z INFO sn_networking::event] Peer PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b") has a Some(254) distance to us
[2024-04-02T03:16:28.809474Z INFO sn_node::log_markers] PeerRemovedFromRoutingTable(PeerId("12D3KooWEScVfEESAf5dYKin3Ss1aMnHe2RAJNa4H85wro4FEF8b"))
For the peer ID above, the node must have started with a binding on port 10425 (::10425) on all adapters.
127.0.0.1 ← INTERNAL
10.0.0.1 ← INTERNAL
192.168.9.1 ← INTERNAL
65.108.236.166 ← WAN IP
Could some of these HandshakeTimedOut errors also come from confusion between the multiple addresses delivered in the payload, or is that a different issue? If it is related, should the internal IPs/ports that are reserved for LAN segments be scrubbed, unless of course libp2p is already intelligently picking the right ip/port/peer id address to dial outbound? If so, never mind …
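To make the scrubbing idea concrete, here is a minimal shell sketch that separates LAN-only addresses from publicly routable ones. The function names are made up for illustration, it only handles /ip4/ multiaddrs, and it only knows the loopback, RFC 1918 and link-local ranges (not CGNAT or IPv6). It says nothing about what libp2p or sn_networking actually do internally.

```shell
# Hypothetical helper: succeed (exit 0) if the IPv4 address is loopback,
# RFC 1918 private, or link-local, i.e. not dialable from the public internet.
is_non_public_ip4() {
  case "$1" in
    127.*|10.*|192.168.*|169.254.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the IPv4 part of a multiaddr such as
# /ip4/10.0.0.1/udp/10425/quic-v1/p2p/<peer id>; fail if it is not /ip4/.
ip4_of_multiaddr() {
  local rest
  rest="${1#/ip4/}"
  [ "$rest" != "$1" ] || return 1
  printf '%s\n' "${rest%%/*}"
}

# Read multiaddrs on stdin (one per line) and print only those whose
# IPv4 address is not in one of the non-public ranges above.
public_multiaddrs() {
  local addr ip
  while IFS= read -r addr; do
    ip=$(ip4_of_multiaddr "$addr") || continue
    is_non_public_ip4 "$ip" || printf '%s\n' "$addr"
  done
}
```

Fed the four addresses from the OutgoingConnectionError line above, `public_multiaddrs` would keep only the 65.108.236.166 entry.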
In the example above, the peer was added only to be removed again about 5 minutes later (added at 03:11:22, removed at 03:16:28).
Also, FWIW, I have upped the socket send/receive buffer sizes on a single Linux host, as per this snippet of text found on the internet (not sure if the scenario below applies to libp2p with QUIC):
Experiments have shown that QUIC transfers on high-bandwidth connections can be limited by the size of the UDP receive and send buffer. The receive buffer holds packets that have been received by the kernel, but not yet read by the application (quic-go in this case). The send buffer holds packets that have been sent by quic-go, but not sent out by the kernel. In both cases, once these buffers fill up, the kernel will drop any new incoming packet.
sysctl -w net.core.rmem_max=2500000
sysctl -w net.core.wmem_max=2500000
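Since `sysctl -w` only changes the running kernel, a quick check of what is actually in effect can be handy. A minimal sketch, assuming a Linux host that exposes /proc/sys/net/core; the helper names are hypothetical and the target is simply the 2500000 value from the two lines above. These are upper limits only, so this does not show whether the node actually requests buffers that large.

```shell
# Target taken from the two sysctl lines above.
WANT_BYTES=2500000

# Hypothetical helper: print "ok" if the current limit meets the target,
# otherwise "low". $1 = current value in bytes, $2 = wanted value in bytes.
buf_status() {
  if [ "$1" -ge "$2" ]; then echo ok; else echo low; fi
}

# Print the status of both limits; silently skips entries that are not
# readable (for example on a non-Linux host).
check_udp_buffers() {
  local name cur
  for name in rmem_max wmem_max; do
    [ -r "/proc/sys/net/core/$name" ] || continue
    cur=$(cat "/proc/sys/net/core/$name")
    echo "net.core.$name=$cur $(buf_status "$cur" "$WANT_BYTES")"
  done
}
```

Running `check_udp_buffers` after a reboot is a quick way to see whether the values were lost; to keep them, the same two settings can go into a file under /etc/sysctl.d/.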
I am experimenting with a wrapper startup script that monitors the CPU and waits for it to drop below 50% (4-core LXC) before launching the next safenode pid. Arbitrarily guessing different intervals wasn't ideal for smoothing the CPU usage curve when spinning up 50+ safenode pids, so I felt it needed to take CPU load into account so as not to max out the CPU.
I suspect some of the failed bootstraps might be due to CPU load from too many safenode pids ramping up at once, or to a QUIC-related issue (no firm data to back that theory up atm). I wanted a graceful ramp-up in as timely a manner as possible, so for now I pivoted to gating on CPU load instead of --interval.
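Roughly, the CPU-gated ramp-up looks like the sketch below. This is a simplified illustration rather than the exact script: all function names are hypothetical, the 5-second sampling window is an arbitrary choice, and `start_node` is a placeholder for however a single safenode gets launched. CPU load is taken from the aggregate `cpu` line of /proc/stat, counting idle plus iowait as idle.

```shell
# Print "<total jiffies> <idle jiffies>" from the first (aggregate) line
# of /proc/stat. Guest time is left out since it is already part of user/nice.
cpu_sample() {
  local label user nice system idle iowait irq softirq steal rest
  read -r label user nice system idle iowait irq softirq steal rest < /proc/stat
  echo "$(( user + nice + system + idle + iowait + irq + softirq + steal )) $(( idle + iowait ))"
}

# Busy percentage between two samples.
# $1,$2 = total and idle at the first sample; $3,$4 = at the second.
cpu_busy_pct() {
  local dt di
  dt=$(( $3 - $1 ))
  di=$(( $4 - $2 ))
  [ "$dt" -gt 0 ] || { echo 0; return; }
  echo $(( (dt - di) * 100 / dt ))
}

# Block until CPU busy, measured over a 5 second window, is below $1 percent.
wait_for_cpu_below() {
  local limit="$1" a b pct
  while :; do
    a=$(cpu_sample); sleep 5; b=$(cpu_sample)
    pct=$(cpu_busy_pct $a $b)   # unquoted on purpose: each sample is two words
    [ "$pct" -lt "$limit" ] && return 0
  done
}

# Launch $1 nodes one at a time, waiting for CPU to drop below $2 percent
# before each launch. start_node must be defined by the caller.
ramp_up() {
  local count="$1" limit="$2" i=1
  while [ "$i" -le "$count" ]; do
    wait_for_cpu_below "$limit"
    start_node "$i"
    i=$(( i + 1 ))
  done
}
```

With a `start_node` function defined to launch one node, `ramp_up 50 50` would start 50 nodes, each only once CPU busy has fallen below 50%.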
No scientific trial was done on the above, but FWIW, with these changes 47 out of the 49 safenodes that were running did bootstrap properly (still with DialPeerConditionFalse and continued HandshakeTimedOut messages). That is a much better bootstrap success rate than before.