Community-Test (oct6) Offline

joshuef · October 6, 2021, 3:43pm

With the testnet tool, you should be able to run ./scripts/logs to download all logs from all nodes you have created.

To get more folk joining faster, you may want the initial nodes to be built w/ less storage space, or perhaps even the node bins built with the always-joinable flag.

Josh · October 6, 2021, 3:47pm

I’m not using AWS so when I try that it does not work. (I suspect that’s why?)
I do have an aws s3 bucket set up but that is a convo and help I’ll need in the other thread I guess.

Just ran in to restart, I am busy remodeling my workspace/shed/office but am more than happy to continue working out the kinks.

joshuef · October 6, 2021, 3:49pm

Ahh, yeh that’s fair.

Probs those scripts could/should be updated to run without the need for syncing to/from s3. Opt out at the very least.

Keep up the good work (on sheds+testnets!)

joshuef · October 6, 2021, 3:55pm

Large files are probably still problematic.

At the moment the blob cache is removed so we can more readily see underlying issues, and we’re working on those. Without the cache, it only takes one chunk failure to break your large file retrieval…

eg, at the mo there are still more AE retries for PUTs than you’d want… Which means a chunk can sometimes fail to be put during the span of our tests.

And right now we still don’t verify PUT data as part of the CLI funcs, so until you’re able to retrieve the file, you cannot really assume that it’s all stored properly, I’m afraid.

It shouldn’t take much to verify + retry for missing chunks, eg. But it’s another thing on the list, basically! So until that’s done, you should retry your PUTs if your GETs fail.
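
A rough sketch of that "retry your PUTs if your GETs fail" loop, in Python. The `put` and `get` callables are placeholders (assumptions, not anything the CLI provides): in practice they would wrap whatever upload command and `safe cat` call you are using. This is only an illustration of the workaround, not how the CLI does or will verify PUTs.

```python
from typing import Callable, Optional


def put_until_retrievable(
    put: Callable[[bytes], str],
    get: Callable[[str], Optional[bytes]],
    data: bytes,
    max_attempts: int = 3,
) -> Optional[str]:
    """Upload `data`, read it back, and re-upload if the read fails or differs.

    `put` returns an address (eg a safe:// URL); `get` returns the bytes
    stored at that address, or None on failure. Both are hypothetical
    wrappers supplied by the caller.
    Returns the address once a GET round-trips, otherwise None.
    """
    for _ in range(max_attempts):
        try:
            url = put(data)
            if get(url) == data:
                return url
        except Exception:
            # Treat any error (timeout, no response, ...) as a failed attempt.
            pass
    return None
```

Keep `max_attempts` small: if a failing GET takes a long time to give up, each extra attempt costs that long again.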

Vort · October 6, 2021, 3:58pm

I did not measure it, but about an hour passed between the cat activity stopping and the error appearing.
Retries under these circumstances may be problematic.

joshuef · October 6, 2021, 4:29pm

Josh:

I am getting a bunch of these on the genesis log

{"timestamp":"Oct 06 14:05:13.814","level":"ERROR","fields":{"message":"some error: TimedOut"},"target":"qp2p::endpoint","threadName":"tokio-runtime-worker"}

These (not so helpful) errors, which I think will be refactored out of qp2p in the not too distant future, are just saying that some connection timed out.

That can happen if eg, a client or node disconnected in a less than graceful fashion (ie the process was force quit).

So not something to worry about per se. (Though if they can be tied to a specific instance of a node/client and buggy behaviour there, it’s good to know… but that’ll likely be hard on an open testnet)
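
If you want to see how many of these there are and where they come from, something like the following Python sketch could tally them. It assumes one JSON object per line with `level`, `target` and `fields.message` keys, as in the line quoted above; the function name is made up for illustration.

```python
import json
from collections import Counter
from typing import Iterable


def count_errors(lines: Iterable[str]) -> Counter:
    """Count ERROR entries per (target, message) in JSON-per-line logs."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict) or entry.get("level") != "ERROR":
            continue
        fields = entry.get("fields")
        message = fields.get("message", "") if isinstance(fields, dict) else ""
        counts[(entry.get("target", ""), message)] += 1
    return counts
```

Pass it an open log file (eg `count_errors(open(path))`) and print `counts.most_common()` to get the noisiest targets first.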

Josh · October 6, 2021, 4:50pm

Solved if I SSH in, yes, but not if I use their browser console. Seems odd, so I’ll put that down to user error too.

Southside · October 6, 2021, 5:04pm

I’m getting timeouts now - using the exact same command as worked previously

Vort · October 6, 2021, 5:11pm

May not be related to large files then.

Josh · October 6, 2021, 5:23pm

Me too.

happybeing · October 6, 2021, 5:26pm

I can’t connect. Am using the updated command from the OP:

RUST_LOG=info $HOME/.safe/node/sn_node --hard-coded-contacts '["178.62.57.96:12000"]' --genesis-key 8a551b912e8f08ce6a6440b324d1b0e42e304957cd01642ba628138c63e89242a03412460639774ac5c6cbcad0c08edb --local-addr 165.227.228.183:0 --skip-igd
Starting logging to stdout
Oct 06 19:23:53.322  INFO sn_node: 

Running safe_network v0.32.0
============================
Oct 06 19:23:53.325  INFO safe_network::routing::routing_api: 3219b9.. Bootstrapping a new node.
Error: 
   0: Cannot start node. If this is the first node on the network pass the local address to be used using --first
   1: Routing error:: Cannot connect to the endpoint: Failed to bind UDP socket
   2: Cannot connect to the endpoint: Failed to bind UDP socket
   3: Failed to bind UDP socket
   4: Address not available (os error 99)

Location:
   src/bin/sn_node.rs:201
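
On Linux, `os error 99` is EADDRNOTAVAIL, which usually means the IP given to bind to is not assigned to any interface on the machine. Whether that is the cause here (eg the `--local-addr` value belonging to a different machine) is only a guess. A quick way to check an address outside of sn_node is to try a UDP bind yourself. A minimal Python sketch, with a made-up helper name:

```python
import socket
from typing import Optional


def udp_bind_errno(ip: str, port: int = 0) -> Optional[int]:
    """Try to bind a UDP socket to ip:port.

    Returns None if the bind succeeds, otherwise the OS error number
    (99 / EADDRNOTAVAIL on Linux when the IP is not a local address).
    Port 0 lets the OS pick any free port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((ip, port))
        except OSError as err:
            return err.errno
    return None
```

Calling `udp_bind_errno("165.227.228.183")` on the machine in question and comparing the result with `errno.EADDRNOTAVAIL` would confirm or rule out this explanation.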

joshuef · October 6, 2021, 5:26pm

I didn’t mean to imply large files == timeouts before.

Just that you shouldn’t expect large files to work atm (or at least: I don’t).

davidpbrown · October 6, 2021, 5:43pm

Unclear the state atm but 1.3MB upload fell over with some error I can’t parse:

Error: 
   0: NetDataError: Failed to get current version: NetDataError: Failed to read current value from Register data: NoResponse

Location:
   /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/core/src/result.rs:1897

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Josh · October 6, 2021, 5:49pm

I am going to restart 1 more time with reduced max_capacity as suggested by @joshuef but mostly so that anyone who has not had time to play and would like to, can.

joshuef · October 6, 2021, 5:50pm

If you can keep your node logs to debug any of this timeout lark that’d be suuuper

Josh · October 6, 2021, 6:36pm

ok, I grabbed them, let me know where to put/send them.
(Three nodes had no logs)

Take 3 is up!

davidpbrown · October 6, 2021, 7:16pm

upload and dog are fast
but cat is stalling and giving nothing…

There should be the 1.3MB safe gif at
safe://hygoynyybecgepgc4d1rbz1qa9r3hhr5hozhdg47uguqo5a53iptdntbukx5y

dog works

safe dog safe://hygoynyybecgepgc4d1rbz1qa9r3hhr5hozhdg47uguqo5a53iptdntbukx5y

== URL resolution step 1 ==
Resolved from: safe://hygoynyybecgepgc4d1rbz1qa9r3hhr5hozhdg47uguqo5a53iptdntbukx5y
= File =
XOR-URL: safe://hygoynyybecgepgc4d1rbz1qa9r3hhr5hozhdg47uguqo5a53iptdntbukx5y
XOR name: 0x430c86999a1c881bc9d8f933ce137c85f8336bb334dd0de379ab6231443353f6
Native data type: PublicBlob
Media type: image/gif

Josh · October 6, 2021, 7:28pm

yes I can’t cat it either.

I can cat my test file though.
safe cat safe://hygoygyybtrdgdisz1u9my5wpthjg8ziuw4sc7xatmnt85yn7yj5m7gkc5mgy > 12345.jpeg

Test is over… for now.

Nigel · October 6, 2021, 11:22pm

Blast! Missed the fun. I’ll be waiting in the wings.

Josh · October 6, 2021, 11:42pm

I went through the logs for the second iteration, below is all that I can find.

root@alpha-safe-node-2:~/logs# cat sn_node.log.2021-10-06-15
{"timestamp":"Oct 06 15:27:16.923","level":"ERROR","fields":{"message":"Error encountered when handling command: UntrustedProofChain(\"provided proof_chain doesn't cover the SAP's key we currently know: SectionAuthorityProvider { prefix: Prefix(), public_key_set: PublicKeySet { public_key: PublicKey(1082…b98a), threshold: 1 }, elders: {497347(01001001)…: 159.65.48.36:12000, d3587d(11010011)…: 178.62.57.96:12000} }\")"},"target":"safe_network::routing::routing_api::dispatcher","threadName":"tokio-runtime-worker"}

{"timestamp":"Oct 06 15:36:57.265","level":"ERROR","fields":{"message":"Sending message (msg_id: MessageId(7639…1831)) to 47.202.65.195:38087 (name 949436(10010100)…) failed with Some(ConnectionLost(TimedOut))"},"target":"safe_network::routing::core::comm","threadName":"tokio-runtime-worker"}
