"community1" Test Network Has Restarted, Join Us!

While there definitely seems to be something odd happening here, I’m not sure it’s quite that simple.

Before you reset your vault I was unable to connect, getting a “failed to bootstrap” error. After the reset I have managed to connect six vaults to your test net: five of them have a routing table size of 7, while the sixth has a size of 6. It seems my nodes have almost all found each other.

Are you able to see my nodes connected to yours?

It could be worth opening an issue on the GitHub issues page too.

Edit: Another node just joined. I now have 5/6 with an RT of 8 and 1/6 with an RT of 7.

The other hard-coded contact came online a few minutes ago and is bootstrapping some nodes.

Here’s a thought:

My “seed vault” is unique in that it starts the network with the “first” flag. It could be that a “first” vault is only allowed to accept a couple of client vaults, so it is preferable if the client vaults it accepts are other hard-coded contacts.* By luck, with the starting and stopping, the other hard-coded contact has managed to bootstrap from mine.

* until caching is turned back on. By the way, I found the boolean constant for that (set to false) but I’ll leave it for now.

It lives! The routing table went to 13, then dropped to 9… and registering an account failed.

Don’t stop nodes that are working… perhaps we can build from what is up now.

I started two instances of the vault. One is still up… with its routing table at 9 atm;
the other crashed with a new error: https://github.com/maidsafe/safe_vault/issues/507

thread 'Crust PeerId(4ffc..) event loop' panicked at 'Logic Error', ../src/libcore/option.rs:700
note: Run with `RUST_BACKTRACE=1` for a backtrace.
thread 'Node thread' panicked at 'called `Result::unwrap()` on an `Err` value: "PoisonError { inner: .. }"', ../src/libcore/result.rs:746
thread 'Node thread' panicked at '

 ==============================
| Result evaluated to Err: Any |
 ==============================

', /home/travis/.cargo/registry/src/github.com-88ac128001ac3a9a/maidsafe_utilities-0.6.0/src/thread.rs:37
stack backtrace:
   1:           0x77c5a0 - std::sys::backtrace::tracing::imp::write::h4c73fcd3363076f5
   2:           0x78022b - std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::h0422dbb3077e6747
   3:           0x77feb3 - std::panicking::default_hook::haac48fa641db8fa2
   4:           0x76b5bf - std::sys_common::unwind::begin_unwind_inner::h39d40f52add53ef7
   5:           0x76c858 - std::sys_common::unwind::begin_unwind_fmt::h64c0ff793199cc1b
   6:           0x5d5f7f - _<thread..RaiiThreadJoiner as std..ops..Drop>::drop::h2f72648cf85a5235
   7:           0x4ad812 - crust..Service::drop_contents.23336::hf90ae88ccf4b1bf8
   8:           0x4ad1ce - core..Core::drop.23331::hf9130c13607d9797
   9:           0x560225 - std::sys_common::unwind::try::try_fn::ha116764aacfa08c1
  10:           0x77b77b - __rust_try
  11:           0x77b70d - std::sys_common::unwind::inner_try::h9eebd8dc83f388a6
  12:           0x5603ea - _<F as std..boxed..FnBox<A>>::call_box::had2bfc8e3c719a1b
  13:           0x77ef34 - std::sys::thread::Thread::new::thread_start::h471ad90789353b5b
thread panicked while panicking. aborting.
Illegal instruction

d’oh - vault stopped itself… with

WARN 18:04:18.470387714 [safe_vault::vault vault.rs:145] Restarting Vault

That was just after the number in the routing table seemed to drop: 8, 7, 6…

Hard to say if it’s just one vault.

Both hard-coded contacts are running “latest”, btw. We truly are the bleeding-edge network now.

And trying to restart the vault, it again doesn’t try, as if the seeds are offline again.

You could try running the seed vault with a perpetual reboot…

while true; do safe_vault; sleep 2; done
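The same idea in a slightly more controllable form, sketched in Python. The `keep_alive` helper and its `max_runs` cap are my own additions, not anything from safe_vault; `max_runs` just lets the sketch terminate, where a real seed vault would run uncapped:

```python
import subprocess
import time

def keep_alive(cmd, delay=2.0, max_runs=None):
    """Rerun cmd whenever it exits, pausing `delay` seconds between runs.

    max_runs=None runs forever (the seed-vault case); a number caps the
    restarts so this sketch can actually finish.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        runs += 1
        subprocess.run(cmd)
        time.sleep(delay)
    return runs

# For the real thing: keep_alive(["safe_vault"], delay=2)
# Demo with a stand-in command so the snippet terminates:
keep_alive(["echo", "vault exited"], delay=0, max_runs=2)
```

The shell one-liner above does the same job with less ceremony; this version just makes the restart count and delay explicit.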

Anyone know what the UDP port does?

root@ncac111:/tmp# netstat -l
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 *:ssh                   *:*                     LISTEN     
tcp        0      0 *:5483                  *:*                     LISTEN     
tcp6       0      0 [::]:ssh                [::]:*                  LISTEN     
udp        0      0 *:5484 

rup

I restarted it just now. My local vault connected straight away. I then disconnected the local vault to leave room for anyone else to connect.

One more thing to note: my local router (not the seed vault, which is in the cloud) has port forwarding of 5483 to the host of the local vault. Since service discovery is crippled for now, port forwarding, just like on the first testnet, might help if you’re on NAT.
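If you want to sanity-check the forwarding, a quick reachability probe could look like this (the `port_open` helper is my own sketch; you’d run it from a machine outside your LAN against your public IP):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# From outside the LAN: port_open("your.public.ip", 5483)
# Locally, a port with nothing listening comes back False:
print(port_open("127.0.0.1", 1))
```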

Not sure if your restart etc. caused it… but my vault on 185… has crapped out and is now back to not being able to bootstrap… ugh!

rup

Odd that, again, the vault (a fresh instance) doesn’t try; it just falls back to the command line without comment.

UDP likely doesn’t do anything; as a protocol it is apparently less reliable than TCP. Given the success with TCP, I think the devs stopped needing to use it, so it’s there perhaps only because the port is suggested for it. In the future, protocols other than TCP might be useful.
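A quick local illustration of that reliability difference (assuming nothing is listening on port 5484 on your machine): a UDP send “succeeds” even with no receiver, while a TCP connect to the same closed port errors out, because TCP negotiates delivery first.

```python
import socket

# UDP is connectionless: sendto returns without error even if no one is
# listening -- the datagram is simply dropped.
u = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
u.sendto(b"hello", ("127.0.0.1", 5484))
u.close()

# TCP performs a handshake first, so a missing listener is an error.
t = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
t.settimeout(1)
try:
    t.connect(("127.0.0.1", 5484))
    tcp_connected = True
except OSError:
    tcp_connected = False
finally:
    t.close()

print("tcp connected:", tcp_connected)
```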

Yes, when I tried just now to run my local vault it just exited to the command line. Restarting the seed vault gets it working again.

Ok…

http://hello.safenet


So… I suspect that on your restarting, the vault has not simply crapped out straight away…

but has gone back to the cycle:

INFO 17:46:28.991947115 [routing::core core.rs:1807] Client(953be0..) Failed to get GetNodeName response.
WARN 17:46:28.992135295 [safe_vault::vault vault.rs:171] Restarting Vault
INFO 17:46:31.330784856 [routing::core core.rs:1032] Client(7ce39e..) Running listener.
INFO 17:46:31.392824366 [routing::core core.rs:1555] Client(7ce39e..) Sending GetNodeName request with: PublicId(name: 7ce39e..). This can take a while.
INFO 17:47:31.392808845 [routing::core core.rs:1807] Client(7ce39e..) Failed to get GetNodeName response.
WARN 17:47:31.393059294 [safe_vault::vault vault.rs:171] Restarting Vault
INFO 17:47:33.737790659 [routing::core core.rs:1032] Client(83de3e..) Running listener.
INFO 17:47:33.798988385 [routing::core core.rs:1555] Client(83de3e..) Sending GetNodeName request with: PublicId(name: 83de3e..). This can take a while.

Will leave it running and see if anything you can do at your end, bb, can get it back up…?

rup

This is what I was referring to earlier. Setting it to true would help all those table entries that drop off and can’t be recovered:

It’s working now, so try restarting any vault that isn’t obviously connected.

Incidentally, though ill-advised for a busy network, I’m running three vaults atm… so that is possible, and it can only help a very fragile network get up and running.

Yes, but that would be cheating :smiley:


So, that didn’t last very long!

All vaults failed at around the same moment.

Restarting gives another variation:

INFO 18:55:11.409107366 [safe_vault safe_vault.rs:96] 

Running safe_vault v0.9.0
=========================
INFO 18:55:13.783470544 [routing::core core.rs:1171] Bootstrapping(PeerId(d95b..), 0)(36425e..
Connection failed: Proxy node needs a larger routing table to accept clients.

and then it drops to the command line, which I’ve not seen before with that comment.

I think I shall wait for Test5 now.

I have now set the seed vault to restart every two minutes.

EDIT: I’ll attend to other matters for now, and leave the seed vault restarting on a two-minute cycle.

and off we go:

INFO 18:38:36.971561498 [routing::core core.rs:1032] Client(caa178..) Running listener.
INFO 18:38:37.051698360 [routing::core core.rs:1555] Client(caa178..) Sending GetNodeName request with: PublicId(name: caa178..). This can take a while.
INFO 18:38:37.121847866 [routing::core core.rs:1371] Client(0c645d..) Added a9a0b1.. to routing table.
INFO 18:38:37.123290646 [routing::core core.rs:411]  --------------------------------------------------------- 
INFO 18:38:37.123395436 [routing::core core.rs:413] | Node(0c645d..) PeerId(3ca8..) - Routing Table size:   1 |
INFO 18:38:37.123504614 [routing::core core.rs:414]  --------------------------------------------------------- 
INFO 18:39:25.882625433 [routing::core core.rs:1371] Node(0c645d..) Added 6c6174.. to routing table.
INFO 18:39:25.882753338 [routing::core core.rs:411]  --------------------------------------------------------- 
INFO 18:39:25.882816074 [routing::core core.rs:413] | Node(0c645d..) PeerId(3ca8..) - Routing Table size:   2 |
INFO 18:39:25.882903510 [routing::core core.rs:414]  --------------------------------------------------------- 
INFO 18:40:04.168851683 [routing::core core.rs:1371] Node(0c645d..) Added a1348c.. to routing table.
INFO 18:40:04.169117319 [routing::core core.rs:411]  --------------------------------------------------------- 
INFO 18:40:04.169193293 [routing::core core.rs:413] | Node(0c645d..) PeerId(3ca8..) - Routing Table size:   3 |
INFO 18:40:04.169261782 [routing::core core.rs:414]  --------------------------------------------------------- 

It’s really weird!!

rup