I have finally got started with testing again but for a machine behind my router (needing port forwarding etc) I keep getting these sorts of messages:
WARN sn_node::api] get_closest query failed after network inactivity timeout - check your connection: Could not get enough peers (5) to satisfy the request, found 1
even though all the port forwarding tests pass. So I tried the same exercise with a Linode VM on a public IP and got the same problem, so it looks to me like a problem with the current testnet.
However, my question is: future development is supposed to get around this NAT traversal problem. How would that work exactly? I need a mental picture of how that is possible.
It works through hole punching. This is a technique that lets a node behind a NAT or firewall accept an incoming connection from another node. It only works for one specific connection and has to be set up very precisely. So, if two nodes are behind a firewall and want to connect to each other, the steps are (detailed blog):
1. Detect if we’re behind a firewall/NAT.
2. Connect with the peer through a relay server.
3. ‘Coordinate simultaneous dial’ by means of this relay, where both nodes connect to each other at exactly the same time on a prearranged port.
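For a mental picture of why the simultaneous dial matters, here is a minimal toy simulation. It is only a sketch, not how libp2p implements it: the `Nat` class, the `send` helper and the example addresses are all made up for illustration, and it assumes a simple NAT that only lets a packet in if the inside host has already sent something to that exact remote address.

```python
from dataclasses import dataclass, field


@dataclass
class Nat:
    """Toy NAT (hypothetical): inbound traffic passes only from remotes we already sent to."""
    public_ip: str
    public_port: int
    allowed: set = field(default_factory=set)

    @property
    def public_addr(self):
        return (self.public_ip, self.public_port)

    def outbound(self, remote):
        # Sending out "punches a hole": replies from this remote are now allowed in.
        self.allowed.add(remote)
        return self.public_addr  # source address as seen from the outside

    def inbound(self, source):
        # A packet gets through only if a hole for its source already exists.
        return source in self.allowed


def send(src_nat, dst_nat):
    """Send one packet from the host behind src_nat to the host behind dst_nat.

    Returns True if the packet is delivered, False if dst_nat drops it.
    """
    source = src_nat.outbound(dst_nat.public_addr)
    return dst_nat.inbound(source)


# Documentation-range addresses, chosen arbitrarily for the example.
nat_a = Nat("203.0.113.10", 40000)
nat_b = Nat("198.51.100.20", 50000)

first = send(nat_a, nat_b)   # dropped by B's NAT, but opens a hole in A's NAT
second = send(nat_b, nat_a)  # delivered: A's NAT already has a hole for B
third = send(nat_a, nat_b)   # delivered: B's NAT now has a hole for A

print(first, second, third)
```

The first packet is lost, but it is what makes the second one succeed. That is why both sides need to learn each other's public address (via the relay) and dial at roughly the same time. Real NATs vary a lot in behaviour, so this only covers the friendly case.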
We currently utilize relays, which allow us to traverse NATs by using a third party as proxy. Relays are a reliable fallback, that can connect peers behind NAT albeit with a high-latency, low-bandwidth connection. Unfortunately, they are expensive to scale and maintain if they have to carry all the NATed node traffic in the network.
It is often possible for two peers behind NAT to communicate directly by utilizing a technique called hole punching[1]. The technique relies on the two peers synchronizing and simultaneously opening connections to each other to their predicted external address. It works well for UDP, and reasonably well for TCP.
[…]
In this specification, we describe a synchronization protocol for direct connectivity with hole punching that eschews signaling servers and utilizes existing relay connections instead. That is, peers start with a relay connection and synchronize directly, without the use of a signaling server. If the hole punching attempt is successful, the peers upgrade their connection to a direct connection and they can close the relay connection. If the hole punching attempt fails, they can keep using the relay connection as they were.
So, relays are already a solution for nodes behind a NAT: they make these nodes reachable via a relay server. Then, once two nodes are connected through a relay, they can choose to try to connect to each other directly by hole punching (DCUtR).
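The fallback behaviour described in the spec excerpt can be sketched roughly like this. It is an illustration under assumptions, not the libp2p API: `Path`, `upgrade`, the `try_hole_punch` callback and the `max_attempts` retry limit are hypothetical names introduced here.

```python
from enum import Enum


class Path(Enum):
    RELAY = "relay"
    DIRECT = "direct"


def upgrade(try_hole_punch, max_attempts=3):
    """Start on the relay; switch to a direct path if a hole punch attempt succeeds.

    try_hole_punch is a hypothetical callback that takes the attempt number
    and returns True when a direct connection was established.
    """
    for attempt in range(1, max_attempts + 1):
        if try_hole_punch(attempt):
            return Path.DIRECT  # the relay connection can now be closed
    return Path.RELAY  # every attempt failed, keep using the relay as before


# Example: the second attempt succeeds, so the peers end up connected directly.
print(upgrade(lambda attempt: attempt == 2))
# Example: hole punching never works, so the peers stay on the relay.
print(upgrade(lambda attempt: False))
```

The point is that a failed hole punch costs nothing beyond the attempt itself, because the relay connection stays open the whole time.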
I ran into various issues with AutoNAT in May this year (#3889, #3986 and #3900), both with TCP and QUIC. These problems are supposed to be solved by a new AutoNAT version. Before AutoNATv2 can be implemented, some other features have to be implemented first.
Yesterday I noticed there is a developer who got a grant from Filecoin to work on AutoNAT version 2:
AutoNAT is about detecting NAT status, so it’s the first step, and until this first step works properly we can’t do NAT traversal. The second step (relays) should work. The third step (hole punching/DCUtR) should work with TCP and was implemented for QUIC in June (#3964).
Yes, it’s on my radar, as I helped test it a while ago but ran into an issue on Linux (see here). Also, lots of routers do not have IGD enabled by default.
We’ll probably use it somewhere down the road though, because it should be really easy to use and set up.
AutoNATv2 seems to be several months away, if the future LibP2P release cycle is similar to the past one.
It doesn’t look like it will make it into the next breaking release, 0.53, and has been postponed to 0.54. That’s not set in stone yet, but see the linked comment in the discussion here: