NAT Traversal ETA?

People,

I have finally started testing again, but for a machine behind my router (needing port forwarding etc.) I keep getting messages like this:

WARN sn_node::api] get_closest query failed after network inactivity timeout - check your connection: Could not get enough peers (5) to satisfy the request, found 1

even though all the port forwarding tests come back OK. So I tried the same exercise with a Linode VM on a public IP and got the same problem, so it looks to me like a problem with the current testnet.

However, my question is: future development is supposed to get around this NAT traversal problem. How would that work exactly? I need a mental picture of how that is possible.

Thanks,
Phil.

1 Like

I’m using one Ubuntu server box on my home network with port forwards via the router, so in some situations it does work with a port forward.

If you used the script I posted, you need to do a hard restart between attempts to start nodes. I’m not sure why, but I don’t think it can reuse the ports.

As I understand it, it’s waiting on the libp2p developers to properly solve the problem in the Rust implementation.

6 Likes

It works through hole punching. This is a technique that lets a node behind a NAT or firewall accept an incoming connection from another node. It works only for a specific connection and has to be set up very precisely. So, if two nodes are behind a firewall and want to connect to each other (detailed blog):

  1. Detect if we’re behind a firewall/NAT.
  2. Connect with the peer through a relay server.
  3. ‘Coordinate simultaneous dial’ by means of this relay, where both nodes connect to each other exactly at the same time on a prearranged port.
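For a mental picture of why the simultaneous dial in step 3 matters, here is a minimal toy model in Python. It is only a sketch under stated assumptions: `ToyNat` and the two dial functions are hypothetical names made up for illustration, and the NAT is assumed to let a packet in only if the inside host has already sent something to that exact remote address. Real NATs vary in behaviour, and this is not how libp2p implements it.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Addr = Tuple[str, int]  # (public IP, port)


@dataclass
class ToyNat:
    """Hypothetical NAT that only admits packets from remotes it has sent to."""
    allowed: Set[Addr] = field(default_factory=set)

    def send_out(self, remote: Addr) -> None:
        # An outbound packet opens a "hole" for traffic coming back from `remote`.
        self.allowed.add(remote)

    def accepts(self, remote: Addr) -> bool:
        # Unsolicited inbound packets are dropped.
        return remote in self.allowed


def one_sided_dial(nat_b: ToyNat, addr_a: Addr) -> bool:
    """A dials B while B stays silent: B's NAT has no hole for A."""
    return nat_b.accepts(addr_a)


def simultaneous_dial(nat_a: ToyNat, addr_a: Addr,
                      nat_b: ToyNat, addr_b: Addr) -> bool:
    """Both peers send first (timing agreed over the relay), then packets arrive."""
    nat_a.send_out(addr_b)  # A punches a hole towards B
    nat_b.send_out(addr_a)  # B punches a hole towards A
    return nat_b.accepts(addr_a) and nat_a.accepts(addr_b)
```

In this model a one-sided dial is always dropped, while the coordinated dial gets through in both directions because each NAT has already seen outbound traffic to the other peer's address.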

This ‘coordinate simultaneous dial’ is part of Direct Connection Upgrade through Relay (DCUtR), which is the hole punching:

We currently utilize relays, which allow us to traverse NATs by using a third party as proxy. Relays are a reliable fallback, that can connect peers behind NAT albeit with a high-latency, low-bandwidth connection. Unfortunately, they are expensive to scale and maintain if they have to carry all the NATed node traffic in the network.

It is often possible for two peers behind NAT to communicate directly by utilizing a technique called hole punching[1]. The technique relies on the two peers synchronizing and simultaneously opening connections to each other to their predicted external address. It works well for UDP, and reasonably well for TCP.

[…]

In this specification, we describe a synchronization protocol for direct connectivity with hole punching that eschews signaling servers and utilizes existing relay connections instead. That is, peers start with a relay connection and synchronize directly, without the use of a signaling server. If the hole punching attempt is successful, the peers upgrade their connection to a direct connection and they can close the relay connection. If the hole punching attempt fails, they can keep using the relay connection as they were.

So, relays are already a solution for nodes behind a NAT. They allow these nodes to be reachable via a relay server. Then, when nodes are connected through a relay, they can choose to try to connect to each other directly by hole punching (DCUtR).
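The "relay first, then try to upgrade" flow described above could be sketched roughly like this. This is an illustration only: `connect_with_upgrade`, the two callbacks and the `attempts` parameter are hypothetical names and values chosen for the example, not the libp2p API.

```python
from enum import Enum
from typing import Callable, Optional


class Path(Enum):
    RELAYED = "relayed"
    DIRECT = "direct"


def connect_with_upgrade(relay_connect: Callable[[], bool],
                         hole_punch: Callable[[], bool],
                         attempts: int = 3) -> Optional[Path]:
    """Return the path we end up on, or None if even the relay is unreachable.

    `relay_connect` and `hole_punch` are stand-ins for the real network
    operations; the retry count is an assumption made for this sketch.
    """
    if not relay_connect():
        return None  # no relay connection, so nothing to upgrade from
    for _ in range(attempts):
        if hole_punch():
            return Path.DIRECT  # upgrade worked, the relay connection can be closed
    return Path.RELAYED  # hole punching failed, keep using the relay
```

The point of the design is that a failed hole punch costs nothing: the peers simply stay on the relayed connection they already had.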

8 Likes

I ran into various issues with AutoNAT in May this year (#3889, #3986 and #3900), both with TCP and QUIC. These problems are supposed to be solved with a new AutoNAT version. For the implementation of AutoNATv2, there are some features that have to be implemented first.

Yesterday I noticed there is a developer that got a grant from Filecoin to work on AutoNAT version 2:

AutoNAT is about detecting NAT status, so it’s the first step, and without this first step working properly we can’t do NAT traversal yet. The second step (relays) should work. The third step (hole punching/DCUtR) should work with TCP and was implemented for QUIC in June (#3964).
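To make the "detect NAT status" step a bit more concrete: the general idea is that a node asks other peers to dial it back on the address it believes it has, and then judges its reachability from the results. The sketch below assumes a simple majority vote with a minimum number of probes. Both the rule and the names (`classify`, `min_probes`) are assumptions for illustration and not the actual AutoNAT algorithm.

```python
from enum import Enum
from typing import Iterable


class NatStatus(Enum):
    PUBLIC = "public"    # peers could dial us back directly
    PRIVATE = "private"  # dial-backs failed, likely behind NAT/firewall
    UNKNOWN = "unknown"  # not enough evidence either way


def classify(dial_back_results: Iterable[bool], min_probes: int = 3) -> NatStatus:
    """Classify reachability from dial-back probe outcomes (True = success).

    The majority rule and `min_probes` threshold are assumptions for this sketch.
    """
    results = list(dial_back_results)
    if len(results) < min_probes:
        return NatStatus.UNKNOWN
    successes = sum(results)
    failures = len(results) - successes
    if successes > failures:
        return NatStatus.PUBLIC
    if failures > successes:
        return NatStatus.PRIVATE
    return NatStatus.UNKNOWN  # tie: keep probing
```

A node that ends up `PRIVATE` would then move on to the second and third steps (relay, then hole punching), while a `PUBLIC` node can just accept connections directly.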

14 Likes

@bzee, while we are waiting for AutoNAT v2, do you think this IGD implementation will help with connection issues?

1 Like

Could I bypass this need by testing on a ZeroTier network that has remote computers (i.e. on the same ZT IP network but on different LANs)?

It will allow a portion of nodes behind NAT to work (circa 30%). With full hole punching, though, that number increases to over 90%.

I am not familiar with this type of network, but if there are routers that translate addresses, then it will still require NAT traversal.

4 Likes

That’s already something! Will it have an effect on the TCP vs. QUIC question?

Not sure if it should? You just run a ZT service and connect to other devices on a ZT network that you have previously created:

At least if I run my own “private” SAFE Network it seems it should “just work”?

2 Likes

Yes, it’s on my radar, as a while ago I helped test it, but ran into an issue on Linux (see here). Also, lots of routers do not have IGD enabled by default.

We’ll probably use it somewhere down the road though, because it should be really easy to use and set up.

5 Likes

AutoNATv2 seems to be several months away, if the libp2p release cycle in the future is similar to the past.

It doesn’t look like it will make it into the next breaking release, 0.53, but is postponed to 0.54. Not set in stone yet, but see the linked comment in the discussion here:

1 Like

Something to get excited about?

This fixes address translation for QUIC that was essentially non-existent before.

4 Likes

Hah, since November I’ve been very frustrated to see almost nothing happening with AutoNAT in the corresponding PR here :face_with_diagonal_mouth::

And now I just found out that the work has been going on all the time here :smile::

And while there’s no definite ETA yet, this comment makes me feel that it’s not that far away anymore :dizzy::

11 Likes