NAT Traversal ETA?

philip_rhoades · September 27, 2023, 7:40pm

People,

I have finally got started with testing again but for a machine behind my router (needing port forwarding etc) I keep getting these sorts of messages:

WARN sn_node::api] get_closest query failed after network inactivity timeout - check your connection: Could not get enough peers (5) to satisfy the request, found 1

even though all the port forwarding etc tests OK. So I tried the same exercise with a Linode VM on a public IP and got the same problem, so it looks to me like a problem with the current testnet.

However, my Q is: future dev is supposed to get around this NAT traversal problem. How would that work exactly? I need a mental picture of how that is possible.

Thanks,
Phil.

aatonnomicc · September 27, 2023, 7:47pm

I’m using one Ubuntu server box on my home network with port forwards via the router, so in at least some situations it does work with a port forward.

If you used the script I posted, you need to do a hard restart between attempts to start nodes. I’m not sure why, but I don’t think it can reuse the ports.

JPL · September 27, 2023, 7:47pm

As I understand it, it’s waiting on the Libp2p guys to get around to properly solving the problem in the Rust implementation.

bzee · September 28, 2023, 6:08am

It works through hole punching. This is when a node can get an incoming connection from another node. This is something that works only for a specific connection and has to be set up very precisely. So, if two nodes are behind a firewall and want to connect to each other (detailed blog):

1. Detect if we’re behind a firewall/NAT.
2. Connect with the peer through a relay server.
3. ‘Coordinate simultaneous dial’ by means of this relay, where both nodes connect to each other exactly at the same time on a prearranged port.

This ‘coordinate simultaneous dial’ is part of Direct Connection Upgrade through Relay (DCUtR), which is the hole punching:

We currently utilize relays, which allow us to traverse NATs by using a third party as proxy. Relays are a reliable fallback, that can connect peers behind NAT albeit with a high-latency, low-bandwidth connection. Unfortunately, they are expensive to scale and maintain if they have to carry all the NATed node traffic in the network.

It is often possible for two peers behind NAT to communicate directly by utilizing a technique called hole punching^[1]. The technique relies on the two peers synchronizing and simultaneously opening connections to each other to their predicted external address. It works well for UDP, and reasonably well for TCP.

[…]

In this specification, we describe a synchronization protocol for direct connectivity with hole punching that eschews signaling servers and utilizes existing relay connections instead. That is, peers start with a relay connection and synchronize directly, without the use of a signaling server. If the hole punching attempt is successful, the peers upgrade their connection to a direct connection and they can close the relay connection. If the hole punching attempt fails, they can keep using the relay connection as they were.

So, relays are already a solution for nodes behind a NAT: they make these nodes reachable via a relay server. Then, once two nodes are connected through a relay, they can choose to try to connect to each other directly by hole punching (DCUtR).
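As a mental picture, the relay-then-upgrade flow can be written as a tiny decision function. This is only a sketch in plain Rust, not the rust-libp2p API: `Connection`, `connect` and the two boolean parameters are hypothetical names made up for illustration, and `hole_punch_succeeds` simply stands in for whatever happens during the coordinated simultaneous dial.

```rust
/// Hypothetical model of the kind of connection two peers end up with.
#[derive(Debug, PartialEq)]
enum Connection {
    Direct,
    Relayed,
}

/// Decide what connection we end up with towards a remote peer.
///
/// `remote_behind_nat`: the remote cannot accept incoming dials.
/// `hole_punch_succeeds`: stand-in for the outcome of the coordinated
/// simultaneous dial (DCUtR); in reality this depends on both NATs.
fn connect(remote_behind_nat: bool, hole_punch_succeeds: bool) -> Connection {
    // A publicly reachable peer can simply be dialled.
    if !remote_behind_nat {
        return Connection::Direct;
    }

    // Otherwise the peer is only reachable through its relay,
    // so the connection starts out relayed.
    let relayed = Connection::Relayed;

    // While relayed, both sides coordinate a simultaneous dial.
    // On success the connection is upgraded; on failure the relay
    // connection keeps being used as before.
    if hole_punch_succeeds {
        Connection::Direct
    } else {
        relayed
    }
}

fn main() {
    for (nat, punch) in [(false, false), (true, true), (true, false)] {
        println!(
            "remote_behind_nat={nat}, hole_punch_succeeds={punch} -> {:?}",
            connect(nat, punch)
        );
    }
}
```

The point of the sketch is the fallback: a failed hole punch costs nothing, because the relayed connection is still there.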

bzee · September 28, 2023, 6:22am

I ran into various issues with AutoNAT in May this year (#3889, #3986 and #3900), both with TCP and QUIC. These problems are supposed to be solved with a new AutoNAT version. For the implementation of AutoNATv2, there are some features that have to be implemented first.

Yesterday I noticed there is a developer that got a grant from Filecoin to work on AutoNAT version 2:

github.com/libp2p/rust-libp2p

AutoNAT v2

opened 02:02AM - 20 Sep 23 UTC

thomaseizinger

tracking-issue

Preliminary tracking issue around all the work that needs to happen around AutoNATv2 or issues that are associated with it.

### Implementation tasks
- [ ] https://github.com/libp2p/rust-libp2p/issues/4226

### Follow-up work
- [ ] https://github.com/libp2p/rust-libp2p/issues/3953

### Known bugs
- [ ] https://github.com/libp2p/rust-libp2p/issues/3986
- [ ] https://github.com/libp2p/rust-libp2p/issues/3889
- [ ] https://github.com/libp2p/rust-libp2p/issues/3308
- [ ] https://github.com/libp2p/rust-libp2p/issues/4873

AutoNAT is about detecting NAT status, so it’s the first step, and without this first step working properly we can’t do NAT traversal yet. The second step (relays) should work. The third step (hole punching/DCUtR) should work with TCP and was implemented for QUIC in June (#3964).
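To picture why the detection step gates everything else, here is a rough sketch in plain Rust. It is only loosely modelled on the idea of asking other peers to dial us back; it is not the AutoNAT protocol or the rust-libp2p API. `NatStatus`, `classify`, `next_step` and the `min_confidence` threshold are all hypothetical.

```rust
/// Hypothetical result of the NAT detection step.
#[derive(Debug, PartialEq)]
enum NatStatus {
    Public,
    Private,
    Unknown,
}

/// Classify our own reachability from a list of dial-back results
/// (`true` = a peer managed to dial us back on our observed address).
/// `min_confidence` is an assumed threshold, not a protocol constant.
fn classify(dial_backs: &[bool], min_confidence: usize) -> NatStatus {
    let ok = dial_backs.iter().filter(|&&reached| reached).count();
    let failed = dial_backs.len() - ok;

    if ok >= min_confidence && ok > failed {
        NatStatus::Public
    } else if failed >= min_confidence && failed > ok {
        NatStatus::Private
    } else {
        // Too few or too contradictory probes to decide.
        NatStatus::Unknown
    }
}

/// What the node would do next, depending on the detection result.
fn next_step(status: &NatStatus) -> &'static str {
    match status {
        NatStatus::Public => "accept incoming connections directly",
        NatStatus::Private => "listen via a relay, then try hole punching",
        NatStatus::Unknown => "keep probing",
    }
}

fn main() {
    let probes = [false, false, false, true];
    let status = classify(&probes, 3);
    println!("{:?}: {}", status, next_step(&status));
}
```

If the classification is wrong or stuck on `Unknown`, the node never gets to the relay and hole punching steps, which is why the detection bugs block the rest.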

Toivo · September 28, 2023, 7:29am

@bzee, while we are waiting for AutoNAT v2, do you think this IGD implementation will help with connection issues?

github.com/libp2p/rust-libp2p

feat(upnp): add implementation based on IGD protocol

libp2p:master ← jxs:add-upnp-protocol

opened 03:34PM - 04 Jul 23 UTC

jxs

+989 -1

## Description
Implements UPnP via the IGD protocol. The usage of IGD is an implementation detail and is planned to be extended to support NATpnp. Resolves: #3903.

## Notes & open questions
As discussed, this hasn't yet any type of tests, since we need a network with a gateway with igd support. Feel free to suggest any ideas for testing.

## Change checklist
- [x] I have performed a self-review of my own code
- [x] I have made corresponding changes to the documentation
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] A changelog entry has been made in the appropriate crates

philip_rhoades · September 28, 2023, 8:04am

Could I bypass this need by testing on a ZeroTier network that has remote computers (ie on the same ZT IP network but on different LANs)?

dirvine · September 28, 2023, 8:09am

It will allow a portion of nodes behind NAT to work (circa 30%). With full hole punching, though, that number increases to over 90%.

I am not familiar with this type of network, but if there are routers that translate addresses then it will still require NAT traversal.

Toivo · September 28, 2023, 8:13am

That’s already something! Will it have an effect on the TCP vs. QUIC question?

philip_rhoades · September 28, 2023, 8:16am

Not sure if it should? - you just run a ZT service and connect to other devices on the ZT network that you have previously created:

At least if I run my own “private” SAFE Network it seems it should “just work”?

bzee · September 28, 2023, 9:43am

Yes, it’s on my radar, as I helped test it a while ago, but I ran into an issue on Linux (see here). Also, lots of routers do not have IGD enabled by default.

We’ll probably use it somewhere down the road though, because it should be really easy to use and set up.

Toivo · October 30, 2023, 6:46am

AutoNATv2 seems to be several months away, if the LibP2P release cycle in the future is similar to the past.

It doesn’t look like it will make it into the next breaking release, 0.53, and has been postponed to 0.54. That’s not set in stone yet, but see the linked comment in the discussion here:

github.com/libp2p/rust-libp2p

`v0.53` release

opened 05:04AM - 24 Sep 23 UTC

thomaseizinger

tracking-issue

Tracking issue for the upcoming `v0.53` release to make sure we don't forget anything.

### PRs to include in final patch release
- [ ] https://github.com/libp2p/rust-libp2p/pull/4559
- [ ] https://github.com/libp2p/rust-libp2p/pull/4569
- [ ] https://github.com/libp2p/rust-libp2p/pull/4102
- [ ] https://github.com/libp2p/rust-libp2p/pull/4029
- [ ] https://github.com/libp2p/rust-libp2p/pull/4645
- [ ] https://github.com/libp2p/rust-libp2p/pull/4656
- [ ] https://github.com/libp2p/rust-libp2p/pull/4672
- [ ] https://github.com/libp2p/rust-libp2p/pull/4349

- [ ] Mention transition to `tracing` in release notes on GitHub
- [ ] Mention API improvements like `PollParameters` & `FromSwarm`
- [ ] https://github.com/libp2p/rust-libp2p/milestone/7
- [ ] Remove all deprecated items
- [ ] Mention more aggressive KeepAlive

Toivo · November 19, 2023, 5:43pm

Something to get excited about?

This fixes address translation for QUIC that was essentially non-existent before.

github.com/libp2p/rust-libp2p

fix(quic): fix address translation

libp2p:master ← nazar-pc:fix-quic-address-translation

opened 09:39AM - 19 Nov 23 UTC

nazar-pc

+62 -4

## Description
This fixes address translation for QUIC that was essentially non-existent before.

## Notes & open questions
Test is analogous to corresponding test in TCP protocol implementation. I have added test for both async-std and tokio, even though compiling crate with just tokio feature enabled causes a lot of compilation warnings. What I'm not sure about is whether old behavior is intentional, but to me it seemed like a bug that caused major issues.

## Change checklist
- [x] I have performed a self-review of my own code
- [ ] I have made corresponding changes to the documentation
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] A changelog entry has been made in the appropriate crates

Toivo · February 12, 2024, 11:31am

Hah, since November, I’ve been very frustrated to see almost nothing happening with AutoNAT in the corresponding PR here:

github.com/libp2p/rust-libp2p

refactor(core, swarm): Transport redesign

libp2p:master ← umgefahren:transport-redesign

opened 09:20PM - 27 Sep 23 UTC

umgefahren

+456 -499

## Description
First implementation of the redesign of the Transport trait. This should not be merged, but instead a place where the proposed changes are discussed. This PR belongs to #4226. I have implemented the trait, according to what has been discussed in the https://github.com/libp2p/rust-libp2p/issues/4226#issuecomment-1737221327.

## Notes & open questions
What do you think of the implementation?

## Change checklist
- [ ] I have performed a self-review of my own code
- [ ] I have made corresponding changes to the documentation
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] A changelog entry has been made in the appropriate crates

And now I just found out that the work has been going on all the time here:

github.com/umgefahren/rust-libp2p

feat(autonatv2): Implement autonat v2

umgefahren:transport-redesign ← umgefahren:implement-autonat-v2

opened 04:08PM - 07 Nov 23 UTC

umgefahren

+3829 -378

## Description
This is a WIP implementation of AutoNATv2. I would appreciate feedback by @mxinden and @thomaseizinger. You don't have to go into any specifics, I just wanted to check up if the direction is correct. A complete and formal PR will follow once everything is done.

- 🚧 Client - ConnectionHandler (~80% done)
- 🚧 Client - Behavior (~10% done)
- 🌰 Server - ConnectionHandler
- 🌰 Server - Behavior

I'm sorry it took me so long to give a sign of life. I sunk a lot of hours into doing it with request-response just to discover I need to do it without it. Additionally I couldn't work on it for the last week since I had to study for a midterm.

## Notes & open questions

## Change checklist
- [ ] I have performed a self-review of my own code
- [ ] I have made corresponding changes to the documentation
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] A changelog entry has been made in the appropriate crates

And while there’s no definite ETA yet, this comment makes me feel that it’s not that far away anymore:

github.com/umgefahren/rust-libp2p

Comment by thomaseizinger - feat(autonatv2): Implement autonat v2

umgefahren:transport-redesign ← umgefahren:implement-autonat-v2

To ensure we are on the right track, here is my view of what we still need from this work:

- Integration of the dial-back ACK from client to server. We need to make sure that we are only sending this once the `Behaviour` has processed the nonce to ensure data consistency. This is a bit tricky / annoying because the Rust ecosystem doesn't have good async generators yet so writing a sequential stream that yields is annoying. Why do we need this? Well, the functionality we _should_ implement is:
  - Read nonce from the stream
  - Yield the nonce to the behaviour
  - Wait
  - Have the behaviour signal that it processed the nonce (remember that they run on different tasks, so actually in parallel)
  - Send the `Ok` response on the stream

From the perspective of the handler, this is a sequential operation that maps to a stream which yields on a single item (the nonce) and later terminates (after sending the `Ok` response). The waiting we can model with a `oneshot` where we pass the `Sender` up to the `Behaviour` and signal that things are ready to continue by waiting on the `Receiver` inside the handler. Importantly, I think we also want all of this to be within a timeout, meaning that using `futures_bounded::StreamSet` would be good. I think we will need something like (pseudo-code):

```rust
struct State {
    io: swarm::Stream,
    oneshot: Option<Receiver<()>>
}

let stream = futures::stream::unfold(state, |mut state| async move {
    if let Some(receiver) = state.oneshot {
        if receiver.await.is_err() {
            // Fail here: behaviour did not process nonce successfully (i.e. `incomingNonce` got dropped earlier).
            // Maybe we can just `?` together with the below IO errors?
        };

        state.io.send(DialBackOk).await; // TODO: Error handling.
        state.io.close().await;

        return None; // Signal stream finished.
    }

    let nonce = io.next().await;
    let (sender, receiver) = oneshot::channel();

    let yielded = IncomingNonce {
        nonce,
        sender,
    };

    state.oneshot = Some(receiver);

    Some((yielded, state))
});
```

The above is a rough sketch. Let me know if you get the idea :)

- Once our unit-tests are stable, we will need a smoke-test with `go-libp2p` to ensure that it works correctly.
- Fix the bugs that turn up during the smoke test.
- After all that is confirmed, we have successfully validated our design and can continue work on https://github.com/libp2p/rust-libp2p/pull/4568 to make it compile and merge it.
- Move this PR to `rust-libp2p` and merge that.
- Cut a new breaking release with this feature.
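For anyone wanting a runnable picture of the handshake in that comment (read nonce, yield it, wait, send `Ok`), here is a rough stand-in using only the Rust standard library. It is not the rust-libp2p code: threads and `std::sync::mpsc` channels replace async tasks and the `oneshot`, and `handler`, `run`, the `processed` field and the 5 second timeout are hypothetical choices for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical event passed from the handler up to the behaviour.
struct IncomingNonce {
    nonce: u64,
    /// Stand-in for the oneshot `Sender`: used once to signal
    /// that the behaviour has processed the nonce.
    processed: mpsc::Sender<()>,
}

/// Handler side. `wire_nonce` stands in for the nonce read from the stream.
/// Returns what the handler would write back on the stream.
fn handler(
    wire_nonce: u64,
    to_behaviour: mpsc::Sender<IncomingNonce>,
) -> Result<&'static str, &'static str> {
    let (processed_tx, processed_rx) = mpsc::channel();

    // Yield the nonce to the behaviour.
    to_behaviour
        .send(IncomingNonce { nonce: wire_nonce, processed: processed_tx })
        .map_err(|_| "behaviour is gone")?;

    // Wait (within a timeout) for the behaviour to signal it is done.
    // This also fails if the event was dropped without a signal.
    processed_rx
        .recv_timeout(Duration::from_secs(5))
        .map_err(|_| "behaviour did not process the nonce")?;

    // Only now is the `Ok` response sent.
    Ok("DialBackOk")
}

/// Run handler and behaviour on different threads, as they run on
/// different tasks in the real design.
fn run(behaviour_accepts: bool) -> Result<&'static str, &'static str> {
    let (tx, rx) = mpsc::channel::<IncomingNonce>();

    let behaviour = thread::spawn(move || {
        let event = rx.recv().expect("handler yields exactly one nonce");
        if behaviour_accepts {
            println!("behaviour processed nonce {}", event.nonce);
            event.processed.send(()).expect("handler is waiting");
        }
        // Otherwise `event` is dropped here, which drops the sender
        // and makes the handler's wait fail.
    });

    let result = handler(42, tx);
    behaviour.join().expect("behaviour thread panicked");
    result
}

fn main() {
    println!("accepted: {:?}", run(true));
    println!("dropped:  {:?}", run(false));
}
```

The ordering is the important part: the `Ok` response can never go out before the behaviour has seen the nonce, and a dropped event turns into an error instead of a hang.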
