"community1" Test Network is Alive, Join Us!

I’ve been doing everything at the command line or with shell scripts. The latter will suffice for handling the log.

Even gnuplot is all scripting and command-line work.

With the vault evidently stuck (no new entries in the log for an hour), I shut it down, removed the log, and rebooted.

It is back on track now with seven entries in the routing table.

Not tested yet, but does truncate -s0 Node.log get round this?

Thanks for the suggestion, I’ll have a look at logrotate.

hi all,

7 in routing table, does this tally?

rup

Yes, seven.

I’ve zoomed in on a segment of the x-axis for greater clarity.

Check again - see if it’s 8 :slight_smile:

INFO 14:17:40.031478724 [safe_vault::personas::data_manager data_manager.rs:476] Stats : Client Get requests received 0 ; Data stored - ID 0 - SD 4 - total 3008 bytes
INFO 14:17:41.911517390 [routing::core core.rs:1574] Node(264d…) Added abb8… to routing table.
INFO 14:17:41.911729189 [routing::core core.rs:414] -------------------------------------------------------
INFO 14:17:41.911767890 [routing::core core.rs:416] | Node(264d…) PeerId(0a32…) - Routing Table size: 7 |
INFO 14:17:41.911789086 [routing::core core.rs:417] -------------------------------------------------------
INFO 14:17:42.217966711 [safe_vault::personas::data_manager data_manager.rs:221] Cache Stats - Expecting 5 Get responses. 21 entries in data_holders.
INFO 14:17:42.330823707 [safe_vault::personas::data_manager data_manager.rs:221] Cache Stats - Expecting 4 Get responses. 17 entries in data_holders.
INFO 14:17:42.400012651 [safe_vault::personas::data_manager data_manager.rs:476] Stats : Client Get requests received 0 ; Data stored - ID 0 - SD 5 - total 4048 bytes
INFO 14:17:42.510215549 [safe_vault::personas::data_manager data_manager.rs:221] Cache Stats - Expecting 3 Get responses. 13 entries in data_holders.

EDIT: Just failed again. Did you reset?

Thanks for the suggestion. I tested that and it has the same effect as >Node.log. That is, it doesn’t work if the vault is busy writing to the file. No worries, I’ll use a loop in order to catch the moment when the vault is quiet.
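
The loop idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the 5-second quiet period and the dated archive name are all hypothetical, not what is actually running on the seed vault.

```shell
#!/bin/sh
# Sketch only: helper names, the default 5-second quiet period and the
# archive naming scheme are assumptions made for illustration.

# Print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Block until the file has not changed size for $2 seconds (default 5).
wait_until_quiet() {
    log=$1
    quiet=${2:-5}
    while :; do
        before=$(file_size "$log")
        sleep "$quiet"
        after=$(file_size "$log")
        [ "$before" = "$after" ] && return 0
    done
}

# Copy the log to a dated archive, then empty the live file in place.
archive_log() {
    log=$1
    wait_until_quiet "$log" "${2:-5}"
    cp "$log" "$log.$(date +%Y%m%d-%H%M%S)" && : > "$log"
}
```

Called as archive_log Node.log from a nightly job, this waits for a pause in logging before copying and emptying the file. There is still a short window between the last size check and the truncation in which the vault could write a line, so it reduces the risk rather than removing it.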

I have more insight into the “punched” statistic:

I believe this figure indicates vaults that have managed to use crust’s hole-punching to circumvent a firewall (of course, you might say) but it doesn’t necessarily mean they have direct access to the Internet. They might be on NAT and behind a firewall, but if they have a neighbouring vault on the same host, they don’t need hole-punching.

Consider this case that I tried just now:

I have four or five vaults running on a computer, configured in their respective config files as “TCP acceptor” on various ports, and I have previously redirected those router ports to that computer. I also opened those ports on the computer, although I don’t think that makes any difference. That works fine, and those vaults show up in the logs as “direct connections” - same as if they were in a data center with an IP.

Now if I add another vault on the same computer, listening on yet another port, but without a redirection to that port in the router, it also appears as a “direct connection” (i.e., not punched). The only reason I can see for this is that it sees the other vaults on the same host and requests a tunnelling action, no hole-punching necessary.

Is my understanding of this correct?

EDIT: I ran another vault, accepting TCP on another port, on another computer on the same LAN, with no redirection. Result: direct connections in the log go up by one but punched connections stay at zero, so it isn’t hole-punching. Its log after one minute is pretty big, so I’m not sure exactly what it is doing, but I see this in it:

DEBUG 16:43:11.910565200 [routing::core core.rs:1704] Adding PeerId(a5f3..) as a tunnel node for 1ca9...
INFO 16:43:11.910565200 [routing::core core.rs:399] Node(9587..) - Indirect connections: 1, tunneling for: 0
DEBUG 16:43:12.044582200 [routing::core core.rs:1535] Node(9587..) Handling NodeIdentify from 1ca9...
INFO 16:43:12.044582200 [routing::core core.rs:1574] Node(9587..) Added 1ca9.. to routing table.

a5f3 is a vault on the other computer. So it somehow found it and added it as a “tunnel node”.

OK, question is: did it find it directly or from the seed vault?

Looking earlier in its log, the other vault “a5f3” is not among those added to the routing table. It first appears on this line:

Received connection info. Trying to connect to PeerId(a5f3..)

That is only a few lines after it gets its own network name, so I provisionally conclude that the seed vault then passed along the IP of a5f3.
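
For anyone repeating this kind of log reading, a minimal sketch of the two lookups involved is below. Both helper names are hypothetical, and the search strings are simply copied from the log lines quoted above; -a is included in case the log contains null bytes.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, and the search strings are
# taken from the log lines quoted in this thread.

# Print the first log line (prefixed with its line number) mentioning a peer.
first_mention() {
    grep -a -n -F -- "$2" "$1" | head -n 1
}

# Count how many times the vault registered a tunnel node.
tunnel_count() {
    grep -a -c -F 'as a tunnel node' "$1"
}
```

For example, first_mention Node.log 'PeerId(a5f3' shows where that peer first turns up, which can then be compared with the line where the vault gets its own network name.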

Some routers are capable of an action called “hairpinning”. This comes into play when your nodes try to connect to the external IP addresses, on the far side of your router, that you have in your configs. A hairpin-capable router will actually connect the nodes in its LAN directly, as though they went out and back in, if that makes sense.

NAT traversal is a quagmire of differing behaviours (then shove UPnP in and again it looks direct).

You are right, though: hole punched would indicate a node has punched through a router to get an incoming connection :thumbsup:

The router’s manual has no mention of hairpinning (NAT loopback).

Well, sure. But I was wondering just what constitutes hole-punching: it appears that being told, by a hard-coded contact, of another node and using that other node as a tunnel, is not hole-punching.

No - consider tunnelling as relaying or proxying for a less capable node.

That makes them sound almost human. :smile:

Everyone, daily log archiving is now in place and should work without hiccups. My goal is to keep this network up continuously until testnet4.

FWIW, here are the last few lines of my log output:

WARN 17:52:56.739078669 [routing::core core.rs:667] Prepared connection info for PeerId(3873…) as f827…, but already tried as f827…
WARN 17:52:57.159612552 [routing::core core.rs:667] Prepared connection info for PeerId(3873…) as f827…, but already tried as f827…
WARN 17:52:57.195381625 [routing::core core.rs:605] Node(fbc5…) Failed to connect to peer PeerId(3873…): Error { repr: Custom(Custom { kind: TimedOut, error: StringError(“Connect failed. errors: (1 of 3) Tcp direct connect failed: No file descriptors available (os error 24) (2 of 3) Tcp direct connect failed: No file descriptors available (os error 24) (3 of 3) Tcp hole punching failed: Error binding another socket to the same local address as the provided socket: Error creating socket: No file descriptors available (os error 24)”) }) }.
WARN 17:52:57.897166349 [routing::core core.rs:667] Prepared connection info for PeerId(3873…) as f827…, but already tried as f827…
INFO 17:52:58.002476894 [routing::core core.rs:399] Node(fbc5…) - Indirect connections: 9, tunneling for: 0
INFO 17:52:58.003130623 [routing::core core.rs:1574] Node(fbc5…) Added f827… to routing table.
INFO 17:52:58.003240304 [routing::core core.rs:414] -------------------------------------------------------
INFO 17:52:58.003269517 [routing::core core.rs:416] | Node(fbc5…) PeerId(7432…) - Routing Table size: 18 |
INFO 17:52:58.003288831 [routing::core core.rs:417] -------------------------------------------------------

Is this useful or is it just noise?

Routing table of 18 is consistent with what I see. The rest, also consistent with what I normally see, is crust trying many different routes, hardly any of which will work, on top of the diagnostic verbosity built into these early versions of the software.
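
One detail in the third warning may be worth a look all the same: on Linux, os error 24 means the process was refused any more file descriptors. A minimal, Linux-only sketch for checking how close a vault is to its limit follows. The helper names are hypothetical, and the vault’s process ID has to be looked up separately.

```shell
#!/bin/sh
# Sketch only (relies on Linux /proc); helper names are hypothetical.

# Number of file descriptors the process currently has open.
open_fds() {
    ls "/proc/$1/fd" | wc -l | tr -d ' '
}

# Soft "Max open files" limit that applies to the process.
soft_fd_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# One-line summary for a given process ID.
fd_report() {
    echo "pid $1: $(open_fds "$1") of $(soft_fd_limit "$1") descriptors in use"
}
```

Running fd_report with the vault’s process ID while the warnings are appearing would show whether the count is sitting at the limit. Reading another user’s process may need the same user account or root.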

Announcement:

When the next official testnet (testnet4) is begun, I will shut down the community1 seed vault (i.e., the “hard-coded contact”) for maintenance, and participate in the official testnet.

The main piece of maintenance will be to replace the binary with the one from the distributable used in the testnet (unless testnet4 shows it to be unsuitable for ongoing use).

The plan is then to restart community1 when testnet4 has wound down to the point where its hard-contacts are consistently unreachable, and it seems to be finished, even if an official announcement of its end doesn’t come for a further day or two.

In practice there appears to be no reason to worry about such overlap, since most people seem to stay on official testnets right up to their official closure.

The reason for replacing the binary at that time is that it is desirable to keep the software on community1 up to date, and consistent enough to function as a network. I admit I have no basis for judging to what extent the current rapid changes to the software are backward compatible, and I also doubt that it is possible to spontaneously get everyone to upgrade to a given version, so the testnet is an obvious point to take the lead in doing this, since nearly everyone would have downloaded the redistributable files anyway to participate in testnet4.

Sorry this is late. Maybe the walk through instructions will be useful for future community networks… perhaps Community4 Test Network?

Thank you. Great idea. I’ve had my hands full as it is.

EDIT: Actually, I should say it’s a brilliant exposition. As you say, it has usefulness beyond the immediate testing environment.

You might be interested in an update on this, which turned out well in the end:

I found a problem with >Node.log when the archiving script ran at midnight, not a fatal one, fortunately:

The vault sailed right through without interruption and kept writing to Node.log, which is the main thing. I could “cat” the live file and it displayed as one would expect.

However, the “>” operation (“redirect nothing to a file”) leaves null characters at the start of Node.log (probably because the vault still has the file open and carries on writing at its old offset, so the gap is filled with nulls), and if grep sees a null character in a file, it considers it a binary file and, by default, won’t print the matching lines. It took me a little while to work out what was going on (no data coming from the grep filtering) and to add the workaround, which is to add the -a flag to grep to force it to treat the input file as text. Then I used the Node.log data to repair the missing section of the plot (i.e., the gap at the start when I was debugging why the plot script wasn’t producing a sensible plot).
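
A minimal sketch of that grep -a workaround, for anyone wanting to reproduce the filtering. The function name is hypothetical and the field layout is assumed from the “Routing Table size” lines quoted earlier in the thread; the actual plot script may well differ.

```shell
#!/bin/sh
# Sketch only: table_sizes is a hypothetical helper, and the field layout
# is assumed from the "Routing Table size" lines quoted in this thread.

# Print "timestamp size" pairs, one per routing-table report in the log.
table_sizes() {
    # -a makes grep treat the file as text even if it contains null bytes;
    # tr then strips the nulls so they do not end up in the plot data.
    grep -a 'Routing Table size' "$1" |
        tr -d '\000' |
        awk '{ for (i = 1; i < NF; i++) if ($i == "size:") print $2, $(i + 1) }'
}
```

Something like table_sizes Node.log > sizes.dat then gives a two-column file that a plotting script can read.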
