safe networks sections
I just wanted to ask something about an idea I had.
I was thinking we could have a little web app that had a database of all the testnets that had ever been deployed and what their online/offline status is.
Clicking an entry on the list would take you through to get the instructions and network contacts and so on for that particular network. It would state the ‘reason’ or what we’re trying to achieve with this particular testnet and before we take it down we could record the results, so we would have a history. That would be the public part of the app.
For people who could login, we can use it to launch a testnet with the specified number of nodes, versions and so on. We could deploy a comnet instance of the app to another cloud provider so that you guys could launch your own networks easily.
Would anyone be interested in this?
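To make the idea a bit more concrete, here is a rough sketch of the kind of record the public part of the app might keep. Everything in it is an assumption for illustration only: the names NetworkRecord and NetworkRegistry, the fields, and the in-memory dictionary standing in for a real database are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class NetworkRecord:
    """One entry in the hypothetical testnet history (all fields are assumptions)."""
    name: str                      # e.g. "feb2"
    purpose: str                   # the 'reason' for this particular testnet
    contacts_url: str              # where the network contacts file is published
    online: bool = True
    results: Optional[str] = None  # recorded before the network is taken down


class NetworkRegistry:
    """In-memory stand-in for the database behind the public list."""

    def __init__(self) -> None:
        self._records: Dict[str, NetworkRecord] = {}

    def add(self, record: NetworkRecord) -> None:
        self._records[record.name] = record

    def get(self, name: str) -> NetworkRecord:
        return self._records[name]

    def take_down(self, name: str, results: str) -> None:
        # Record the results first, then mark the network offline,
        # so the history always has an outcome for a dead testnet.
        record = self._records[name]
        record.results = results
        record.online = False

    def listing(self) -> List[Tuple[str, str]]:
        # What the public front page would show: name plus online/offline.
        return [
            (r.name, "online" if r.online else "offline")
            for r in self._records.values()
        ]
```

The launch side (node counts, versions, cloud provider) is deliberately left out, since that would depend on whatever tooling actually deploys the testnets.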
I just killed a node that seemed unresponsive and tried again - this time with the OTLP env var set
willie@gagarin:~/projects/maidsafe/DBC-testing$ export OTEL_EXPORTER_OTLP_ENDPOINT="http://dev-testnet-infra-543e2a753f964a15.elb.eu-west-2.amazonaws.com:4317"
willie@gagarin:~/projects/maidsafe/DBC-testing$ safe node join --network-name feb2
Creating '/home/willie/.safe/node/local-node' folder
Storing nodes' generated data at /home/willie/.safe/node/local-node
Starting a node to join a Safe network...
Starting logging to directory: "/home/willie/.safe/node/local-node/"
The opentelemetry traces are logged under the name: sn_node_369ZYzT7qo
Node started
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: sending_queue is full
Node PID: 196199, prefix: Prefix(10), name: bcca52(10111100).., age: 5, connection info:
"0.0.0.0:50009"
willie@gagarin:~/projects/maidsafe/DBC-testing$ OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: sending_queue is full
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: sending_queue is full
So we have three sections and one is like this:
| XorName | Age | Address |
|---|---|---|
| 0660c1.. | 5 | 138.68.170.76:53645 |
| 0ee7af.. | 5 | 159.65.19.68:40482 |
| 15cdcf.. | 5 | 138.68.147.238:53631 |
| 22d823.. | 5 | 178.62.26.90:41163 |
| 452091.. | 5 | 178.128.37.48:53097 |
| 4d490f.. | 5 | 139.59.169.19:55418 |
| 64a761.. | 5 | 159.65.23.21:60487 |
How can it be that they all have the age 5? I got that age right away when I joined, and I am not an elder. And aren’t the elders supposed to be older than newcomers anyway?
yes please @chriso
Linux fedora
VPN
[koan@i9fed ~] safe networks add feb2 https://sn-node.s3.eu-west-2.amazonaws.com/testnet_tool/feb2/network-contacts
Network 'feb2' was added to the list. Network Map is located at 'PublicKey(13ec..912f), url: "https://sn-node.s3.eu-west-2.amazonaws.com/testnet_tool/feb2/network-contacts"'
[koan@i9fed ~] safe node join --network-name feb2
Storing nodes' generated data at /home/koan/.safe/node/local-node
Starting a node to join a Safe network...
Starting logging to directory: "/home/koan/.safe/node/local-node/"
The opentelemetry traces are logged under the name: sn_node_gMq8wC1FS3
Node started
[koan@i9fed ~]$ OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: error trying to connect: tcp connect error: Connection refused (os error 111)
(PID: 28588): Encountered a timeout while trying to join the network. Retrying after 30 seconds. Node log path: /home/koan/.safe/node/local-node/
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: error trying to connect: tcp connect error: Connection refused (os error 111)
(PID: 28588): Encountered a timeout while trying to join the network. Retrying after 30 seconds. Node log path: /home/koan/.safe/node/local-node/
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: error trying to connect: tcp connect error: Connection refused (os error 111)
(PID: 28588): Encountered a timeout while trying to join the network. Retrying after 30 seconds. Node log path: /home/koan/.safe/node/local-node/
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: error trying to connect: tcp connect error: Connection refused (os error 111)
(PID: 28588): Encountered a timeout while trying to join the network. Retrying after 30 seconds. Node log path: /home/koan/.safe/node/local-node/
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: error trying to connect: tcp connect error: Connection refused (os error 111)
(PID: 28588): Encountered a timeout while trying to join the network. Retrying after 30 seconds. Node log path: /home/koan/.safe/node/local-node/
My first test, in case it’s useful. I’m only at user level.
They have not relocated, so they have not aged. This is an area the stable set “fixes”, and with it we would not have sections where everyone is age 5. This section is in danger as folk will simply switch off computers to play here and that will kill it. But we will learn from it and already internal slack has identified a bunch of small tweaks we need to make and a slightly larger fix (stable set).
All in all this is a great testnet from our perspective, although it may not seem it. i.e. the Put fails are likely not failing and also we retrieve here from single nodes so folk churning their nodes will cause havoc, but seeing that is good.
@nos
When you paste output to the msg compose window, highlight all your terminal output and hit the formatted text icon as shown above.
Makes it much easier to read.
Still can’t get a node involved
scott@scott-desktop:~$ safe node install
Downloading sn_node version: 0.78.2
Downloading https://sn-node.s3.eu-west-2.amazonaws.com/sn_node-0.78.2-aarch64-unknown-linux-musl.tar.gz...
Error:
0: Error downloading release from 'https://sn-node.s3.eu-west-2.amazonaws.com/sn_node-0.78.2-aarch64-unknown-linux-musl.tar.gz'
1: UpdateError: Download request failed with status: 404
Location:
sn_cli/src/operations/helpers.rs:91
Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
There is a mistake somewhere in the script, it works with manually selected version:
safe node install -v 0.73.2
I’m now getting the same error as @nos just this message recurring
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The service is currently unavailable): , detailed error message: error trying to connect: tcp connect error: Connection refused (os error 111)
Getting the same messages when trying to join with telemetry
Also not been able to put:
safe files put elephant.gif
Error:
0: ClientError: Did not receive sufficient ACK messages from Elders to be sure this cmd (MsgId(17e4..4e22)) passed, expected: 7, received 4.
1: Did not receive sufficient ACK messages from Elders to be sure this cmd (MsgId(17e4..4e22)) passed, expected: 7, received 4.
Did you remember to run
export OTEL_EXPORTER_OTLP_ENDPOINT="http://dev-testnet-infra-543e2a753f964a15.elb.eu-west-2.amazonaws.com:4317"
before trying the node join?
What do you think I’ll start again.
edit: getting the same error as @stout77 now
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: sending_queue is full
(PID: 6275): Encountered a timeout while trying to join the network. Retrying after 30 seconds. Node log path: /home/scott/.safe/node/local-node/
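Two different OTLP failures show up in the output pasted in this thread. "Connection refused (os error 111)" means nothing accepted the TCP connection at whatever endpoint the exporter tried, while "sending_queue is full" comes back from the gRPC server, so the endpoint was at least reachable. Here is a small sketch for tallying them from pasted node output. The label names and the classify/summarise helpers are made up for illustration; the substrings are copied from the log lines above.

```python
from collections import Counter
from typing import Iterable

# Substrings copied from the errors pasted in this thread.
# The labels on the left are hypothetical names, not anything sn_node emits.
PATTERNS = {
    "collector_queue_full": "sending_queue is full",
    "endpoint_unreachable": "Connection refused",
    "join_timeout": "Encountered a timeout while trying to join the network",
}


def classify(line: str) -> str:
    """Return the label of the first known pattern found in the line."""
    for label, needle in PATTERNS.items():
        if needle in line:
            return label
    return "other"


def summarise(lines: Iterable[str]) -> Counter:
    """Count how often each kind of line appears, ignoring blank lines."""
    return Counter(classify(line) for line in lines if line.strip())
```

Feeding it the node's terminal output line by line gives a quick view of whether a run is dominated by an unreachable endpoint, an overloaded collector, or join timeouts.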
OK, I killed my node, installed everything again and started again with proper OpenTelemetry settings. Now I’m getting a wall of this:
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (Unknown error): , detailed error message: sending_queue is full
Also, the re-installation didn’t properly clean my machine, as I still have:
- register folder
- sn_node.log
- reward_secret_key
- reward_public_key
from previous run earlier today.
And right under “Chunks” folder there are two folders:
0 created at 23:02 local time 1,1GB (this is the current run)
1 created at 21:18 local time 560,9MB (previous run)
Also my sn_node.log is totally empty. Maybe it is supposed to be that way as I didn’t set any logging at all?
I think I’ll leave it on for the night. I’m heading to sleep soon.
Thanks @Maidsafe and everyone again!
But those files can’t be catted in any case. At least not the one I tried.
Yes, some may fail, but it’s likely some Gets are failing as we ask a single node in this testnet, i.e. we are not querying all replicas. It is good to see though
I will wait for this change before trying again, and will join the next testnet, where use of this var is optional and left to each sn_node operator’s personal preference to ship or not ship their local logs to an external endpoint.
I tried today without setting this variable intentionally, and got the same OpenTelemetry errors as others, along with an error: Encountered a timeout while trying to join the network. Retrying after 30 seconds, also similar to others.
I did notice that sn_node --help no longer lists --public-addr, but safe node join --help lists the --public-addr parameter, so if you try to pass --public-addr to the safe node join command, sn_node complains that it’s receiving an invalid argument: --public-addr.
If folks want to launch sn_node directly via sn_node, as opposed to the safe cli, by providing their --public-addr argument (as in the past) and other arguments directly, is that no longer a valid or supported method of starting the sn_node pid?
I hope I didn’t use the wrong safe cli and sn_node version, but the current version I tried with were:
sn_node 0.73.2
sn_cli 0.69.0
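One way to spot this kind of mismatch before starting a node is to check whether a flag actually appears in a binary's --help output. The sketch below is only that, a sketch: help_lists_flag and binary_lists_flag are hypothetical helpers, and the only assumption about sn_node and safe is that they print their options when given --help, as described above.

```python
import re
import subprocess
from typing import List


def help_lists_flag(help_text: str, flag: str) -> bool:
    """True if the flag appears in the help text as a whole option name."""
    # The lookarounds stop "--public-addr" from matching inside a longer
    # option name such as "--public-address".
    pattern = r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])"
    return re.search(pattern, help_text) is not None


def binary_lists_flag(command: List[str], flag: str) -> bool:
    """Run `<command> --help` and check whether the flag is mentioned."""
    result = subprocess.run(
        command + ["--help"], capture_output=True, text=True, check=False
    )
    # Some tools print help to stderr, so look at both streams.
    return help_lists_flag(result.stdout + result.stderr, flag)
```

For example, binary_lists_flag(["sn_node"], "--public-addr") and binary_lists_flag(["safe", "node", "join"], "--public-addr") could be compared; if they disagree, the flag should not be passed through safe.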
this got me too, thought I was being retarded, just sat down to see why, seems I no longer need to
Regarding the OTLP situation, I’m going to look into that.
Thanks for pointing out the situation with the node command.
To be honest, this makes me think there’s a wider usability issue here. The node install command is somewhat useful (and I’ve fixed the bug with it using the wrong version - that was because we added a new crate to the release), but I’m actually not very convinced that using safe to manipulate the node is very useful, and we’ve just now demonstrated that it leads to problems with having two interfaces to maintain.
If the node command is essentially just exposing the same set of arguments that the node binary does, then what advantage are you really getting from just using the node binary directly?
I think I’m of the opinion now that the node sub command should be removed from safe. We could have an install script for the node that’s very similar to safe.