NodeDiscoveryNet [07/07/23 Testnet] [Maidsafe nodes offline]

That means that the app doesn’t know enough other nodes to initiate a request. So it’s waiting for more info.

The client, for example, always wants to have 20 peers before it starts. That should normally happen very fast.

(cc @qi_ma, I wonder if we can relax that with the kad::put changes of late?)

3 Likes

Checking out the state of things: we’re down to about 13-15 nodes per droplet now. That seems pretty consistent across the board, so that may be our new target of nodes per machine.

Here’s a bit of a random sample of the nodes’ state (we’ll be looking to automate putting this to s3 so folk can pull and work with this/any other data for the maidsafe nodes):

Node PeerId                                             PID      Memory (MB)     CPU (%)    Record Count
12D3KooWS7ATnNVDp281wShdPhGcS9Az9TKLaiA2b1NHHAr5iYEU    2901     139.645         0.0        1139
12D3KooWGExuh9YkgxVuXLh3RJPaSW2F4VZZ1b1S3dFLW7wLWdhu    2920     218.16          0.0        1049
12D3KooWGiRciszcRb92MK1TcdjrJbVzRHDWKfvExq36z4foJWmk    2784     186.555         0.0        2271
12D3KooWRbEhRd2yhgRssDvLaVukoKp6DhaueARQ94zaTMNoBK6g    2794     186.535         0.0        1112
12D3KooWKfYCD4o1nCyqXAK5QTifUp95k5jowE4UGq1KVqWp6sGD    2844     228.637         0.0        1078
12D3KooWHJpiC8rMKhNX7XoJC7CoYtgeKZkabvgbmjatNhop8WUj    2882     290.285         0.0        1210
12D3KooWN1rcVnZD7KaJQ3EBeSigYPLN9rFWH9BtEx3AauhBD8yf    2854     269.758         0.0        1223
12D3KooWFWnepnWkg3nr7komwCuPexvFoT2Bvzgp3C6ASUA7euo3    2744     229.953         6.7        990
12D3KooWP4GYsX8sbJMQv3Gq3ojk4WW5hG8Z5KjHm25AztmHufEk    2734     322.555         0.0        1558
12D3KooWEuvppN95JrQmD9SwkJZa1vQQVU2TF8vaxQVeZuhRLSx6    2814     316.145         0.0        1596
12D3KooWSiM7gTgnq6yfvu7rdn3jn33Hea2nLW1UAmze6xYbW8MY    2911     242.23          0.0        1620
12D3KooWGQzE8r5HBnFFxdxsGpKTfBAFXtNH9nmxKTuv7UGgMMNB    2891     238.449         0.0        1609
12D3KooWG66hmm7XM7PhA27yACP6p8RBBY7mJasSui7Gp2mxtRpu    2754     182.668         0.0        915
12D3KooWBPG1jzJYxt3Q9NkXy7E8HdnApjoPQp5DV5m9hmXCE8mr    2764     261.508         0.0        1477
12D3KooWANuNsnVt56m6XmDvoMiF7rjPvzSHemHZgdN5pK4iJCwQ    8943     244.715         0.0        1071
12D3KooWSGNhhwK31k57HG1zt1Pgd7rp2oeXQgRbzjT5Aak4i9pR    8895     292.047         0.0        1028
12D3KooWMWBDV93UcLYUczwmZiJnjjvbWvYiDd9D9euD9n6VShjG    9060     149.34          6.7        829
12D3KooWREdgsvW4FP4Zxo8WTxtLE8zk8S7mnorLUxYQvTx2JPU8    9070     125.094         0.0        639
12D3KooWJLN6fvxa7oYSfKvefEcmUd1cPkkfWkfpueUA8Kx5WZz6    8915     234.277         0.0        961
12D3KooWCUfzjFMMJ8gVbHGNrYKTuAocxZeE9cFGSSn4TmjZaL2n    8973     234.164         0.0        1190
12D3KooWQZr8stkiAzW5AMtt2egHBFP5FUjSg2uoBZn5zG9atGrp    9032     249.309         0.0        1528
12D3KooWH1CNLXPmc8DekjX8J4kheLLctBiirifw3NVWu5HeaLw3    8963     419.84          0.0        1475
12D3KooWNZQFxNgCMUXG3NHyqiRVUJ2gCTNaBEcy3bQZrqDssiNy    8925     395.094         0.0        1256
12D3KooWKFetWEifQ8tPSfFS86EaDUArB424TnrmZWS25RfNa7MP    8983     157.289         0.0        2052
12D3KooW9wdFCmdaePSL2ob9p8w3QoTEHRmyR2L3bM3uzof6hCRg    8992     249.254         0.0        1093
12D3KooWQmHKpnm6hXMLoyeZSmVUawyZBbyYyHzGk9DeBD5vP3wd    9041     231.762         0.0        941
12D3KooWS33aPXceEW2xr6JisR32fjkm8ZcJZmBB8ihCfbjmuC8u    9002     284.926         0.0        2019
12D3KooWCBqn2h7QPcJcXHL1bCUuVPEgjgUjD8CDDaazrB3gfgzs    8885     190.445         0.0        650
12D3KooWDyGoCHD2wgRFtsvrCN8yWzqW8zQV2SRpLNXLwonFTAs1    8707     266.746         0.0        2009
12D3KooWPfQy4tifRdd8dwHrL5FmVAGhsc3JTRa2Rp5NY7qqgdKz    2692     227.492         0.0        1377
12D3KooWKj97KhhJr2rSik9Wz3goTgtbfzfkC1EzVyecq31Wx6Xq    8555     255.852         0.0        1175
12D3KooWA74SkJBDev1M72m418gUSzDbPAGs7Dp9WG85Zp13MFoS    6074     281.523         0.0        1866
12D3KooWJbZeARHEMQ4FnabK4TUMDGy9JpVrWZCVCUm4AikugLKg    8749     361.809         0.0        1781
12D3KooWH8Aeih97LuzfETdJ6zp8fPt67cVqLa4rJeNqk7eYKM3b    8600     266.676         0.0        1448
12D3KooWRQmb82qCDc7iKkYjaDaVk3cSRgT7uq8uycBguj2pnCux    8850     275.289         0.0        893
12D3KooWCiViubkSU3xjNsMZ2VMzKgUPVYpf9PnLmBAuUtU2cBjc    8315     235.098         0.0        1259
12D3KooWKSeYU4zh3M3oxBiQTkJoiug7yj5kJCHx5zAhsXkmEQVC    8840     212.898         20.0       1156
12D3KooWH9D6ARCmxKv1pQ5oT1DLzoJrQ29TdwjkMJgUPKg7v4Aw    8510     262.555         0.0        1067
12D3KooWDmPW6a6BMfbJtmsMkpwuZshPrDgDXua9LnceRQd2wTrA    8254     338.426         0.0        2360
12D3KooWSromHHAK598Vo3ry1rvbbrbi8E25idpW3RJgZTSgu2QU    8274     256.664         0.0        1221
12D3KooWSADUX6sEvcxfssjtmNHqPXHte3J9wjDLdKR8GPwScbm6    8792     199.059         0.0        1464
12D3KooWEJ51zEmiTM598D3neGpo9eHh5qJhQ15jTWoArvAxtW5Z    2930     202.094         0.0        1332
12D3KooWLafFSFHTdFNSJSgTJBswx3tG9fG9mZgnAVzKVY9ZW2ku    2912     270.609         6.7        1151
12D3KooWSjFjouAXEfbEykqrqTo8ZCPfeLL65Apd216ztiz1KsQf    2921     247.738         0.0        1469
12D3KooWGupTEnx5oMtDvjSinQgiXuZRhnDfjgeW6D83SozwUdRV    2849     154.535         0.0        900
12D3KooWCB4jtkNoNuBqW69WiBDjSaGCC4Vzqe3bbJCTJjcUU4MV    2830     237.578         0.0        1523
12D3KooWCUn5eAtKN1bcnMZ8786pp5Ppg52kkZbmGfQBnox8v1cW    2773     253.188         0.0        1395
12D3KooWEFjpAqLHBb4Hpo8zPU91hhtYcqyYyVgW5bqS5EWfsi1P    2763     206.453         0.0        715
12D3KooWM1hjM2xkKPD5ASkR3qGMzC7ThnQuXkh2XWDMxRKfLDDv    2858     225.238         0.0        1556
12D3KooWCZVoRdkPtMxDqFCPrsVsA6X65tXmLfFxQ16sApp55w4U    2821     270.238         0.0        1630
12D3KooWMqw9pfPSJ4qqZHyWoX3aqMYgi15nNoqk8g1mRFRsPzp2    2894     258.746         0.0        762
12D3KooWPDnaNWoxvjmmogq4sQuGfVWBusTZFUPAuHD4JRtYAveM    2876     302.629         6.7        1752
12D3KooWEY2RyqY3ydwaiTULdX5vEZJXSkcKQx2z2S8qWdj8BKvD    2885     132.152         0.0        734
12D3KooWFHpAjw9FLbLPm2FCNd2yn72u3XrJcSyDZgroZCWM5WER    2753     159.891         0.0        764
12D3KooWEzwh1jgfahtAybtwjS23SbDUdZmmxDuMQUhvCm4fPB2c    2903     170.586         0.0        1250
12D3KooWCgYE4oYrZ8GWbDGJGGAX26dKAoNc3xk2y5VBkSW8YBtb    2867     222.672         0.0        2197

As you can see, some nodes are approaching capacity (`const MAX_RECORDS_COUNT: usize = 2048;` for this testnet), although with churn etc. they’re likely holding more records than they should be.
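For anyone who wants to keep an eye on that themselves, here’s a rough sketch of a check against that cap. It assumes one directory per node under a base path like the one used for home-run nodes later in this thread; adjust `BASE` if your nodes live elsewhere.

```bash
#!/usr/bin/env bash
# Sketch: flag any local node whose record_store is getting close to the
# MAX_RECORDS_COUNT of 2048 used for this testnet.
# BASE assumes one directory per node (e.g. ~/.local/share/safe/node/<id>);
# adjust it to match your own layout.
BASE="$HOME/.local/share/safe/node"
LIMIT=2048

for dir in "$BASE"/*/record_store; do
    [ -d "$dir" ] || continue
    count=$(ls "$dir" | wc -l)
    if [ "$count" -ge $((LIMIT * 9 / 10)) ]; then
        echo "WARNING: $dir holds $count records (limit $LIMIT)"
    else
        echo "$dir: $count records"
    fi
done
```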

I am still able to download our test data bundle.

10 Likes

Please, is there a tutorial describing how to start the node?

4 Likes

The OP has the instructions to run the node. Can you try that and post here any problems you run into?

2 Likes

I am not sure of the command I should run once I have installed the node and the client.

3 Likes

Start with this and see how you get on.
If you have a TCP port forward set up you can add:
`--port <port number>`
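For example (the port number here is just a placeholder for whichever one you’ve forwarded):

```bash
# Run the node on a specific, forwarded TCP port (12000 is only an example)
SN_LOG=all safenode --port 12000
```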

1 Like

If you are on Linux / OSX then `SN_LOG=all safenode` will run the node, as long as you have installed safeup and used it to install the node.

Step by step:

- `curl -sSL https://raw.githubusercontent.com/maidsafe/safeup/main/install.sh | bash`
- `safeup node -v 0.86.5`
- `export SAFE_PEERS="/ip4/206.189.21.247/tcp/45415/p2p/12D3KooWJa8AZnpp8uxUHHU7VJj1Stypm1Nt5bV3Yc62KnatRVFK"`
- `SN_LOG=all safenode`

That should be you running safenode
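If it’s easier to copy-paste, here are the same steps as one block (the peer address is the one given in this testnet’s OP, so swap it if it changes):

```bash
# Install safeup, then use it to install safenode 0.86.5
curl -sSL https://raw.githubusercontent.com/maidsafe/safeup/main/install.sh | bash
safeup node -v 0.86.5

# Point the node at a known peer (address from this testnet's OP)
export SAFE_PEERS="/ip4/206.189.21.247/tcp/45415/p2p/12D3KooWJa8AZnpp8uxUHHU7VJj1Stypm1Nt5bV3Yc62KnatRVFK"

# Run in the foreground with full logging so you can see what it's doing
SN_LOG=all safenode
```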
4 Likes

Thanks for your help. I normally find this sort of stuff easy to get, but I guess I’ll need to wait a bit.

I am on OSX and nothing happens.

4 Likes

Can you post the output you get on the screen? I’m not on OSX but we might be able to work through it.

2 Likes

Simply no return at all.

It might be running in the background, I don’t know.

I would actually go first with what @dirvine has recommended, which will start the node in the foreground, and in that case you should see its output directly. That way you can definitely tell it is running or doing something. You don’t need to have the logs go to file.

If you used someone else’s command to start the node in the background (using &), you can check if it’s running using pgrep safenode and you should see some logging output at /Users/<username>/Library/Application Support/safe/node/<peer id>/logs.
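Roughly, something like this (the exact log filenames can vary between versions, so the glob is deliberately loose):

```bash
# Is a safenode process running at all?
pgrep -l safenode

# Follow whatever it has written to its log directory (macOS path as above)
tail -f "$HOME/Library/Application Support/safe/node/"*/logs/*
```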

4 Likes

What @chriso said!

I was about to launch into how to run it in the background with ‘&’ and how to find the logfile if there is one, but if the process is running you can find it with `pgrep safenode` or `ps -ef | grep safenode`.

3 Likes

I was a few hours late to the party and started 50 nodes on an AWS t4g.medium.

CPU
CPU usage also seems higher, but maybe that’s because there were a lot of uploads going on from folk. It has settled down now; there were some very busy nodes yesterday evening.

RAM
For previous testnets I’d seen the ‘RES’ RAM usage for the safenode process being around 24MB. For this one it is a big range, between 42MB and 131MB. The median is probably around 80MB.

Just spotted that I only have 40 safenode processes running now! Free RAM is only 250MB so I think it ran out of RAM in the night and the OOM killer stopped some of the safenode processes. Yes, a look through syslog confirms oom-killer has been killing safenode processes at various times.
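If anyone wants to check the same thing on their box, something along these lines works (log format and location vary a bit by distro):

```bash
# Look for OOM-killer entries mentioning safenode in the syslog
grep -i "oom" /var/log/syslog | grep -i safenode

# Or, on systemd machines, pull kernel out-of-memory messages from the journal
journalctl -k | grep -i "out of memory"
```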

Storage
I’m using a ‘Magnetic’ type AWS disk and I’m seeing IO Wait of 40% sometimes, which is not very healthy, so that will be a factor in deciding a sensible balance of CPU, RAM, disk size, disk type and node count.
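I’m watching that with iostat (from the sysstat package), e.g.:

```bash
# Extended device stats every 5 seconds; %iowait appears in the avg-cpu line
iostat -x 5
```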

Record count and storage used
Attached are 2 files showing the count of records that I get using these one-liners:

for i in {1..50} ; do echo $i ; ls /home/ubuntu/.local/share/safe/node/$i/record_store | wc -l ; echo ; done

for i in {1..50} ; do echo $i ; sudo du -sh /home/ubuntu/.local/share/safe/node/$i/record_store ; echo ; done

It’s between 300ish and 2048 Records per Node. (is 2048 the limit?)

It’s between 200ish and 800ish MB per Node.
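For what it’s worth, the two loops above can be combined into a single pass per node:

```bash
# Record count and on-disk size for each of the 50 nodes in one pass
for i in {1..50}; do
    dir="/home/ubuntu/.local/share/safe/node/$i/record_store"
    printf "node %-3s %6s records  %s\n" "$i" "$(ls "$dir" | wc -l)" "$(sudo du -sh "$dir" | cut -f1)"
done
```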

du_records_202307080928.zip (953 Bytes)
count_records_202307080927.zip (758 Bytes)

13 Likes

That’s encouragingly straightforward. Upload and download are nice and simple. :+1:

Unclear why the download repeats “The client still does not know enough network nodes.”, as if the information is not retained.

The format of the upload log ~/.local/share/safe/client/uploaded_files/file_names_999 is unclear, as it comes on the back of user-friendly terminal output, as if it’s intended as a log for the user.

The download of one file seems to download everything in the folder?

And I wonder whether using a protocol prefix such as xor:// or safe:// would be good practice. If users are taking note of what those random strings are, it helps set the context for understanding… and lets computers parse them too.

trivially

In the OP - that should be safe files,
or perhaps an identical behaviour when file is used?

==================
Node location is less obvious… you have to run it once to seek it out, then it’s at ~/.local/share/safe/node.
Some upfront choice of where to put it would be good. There’s a risk of the /home partition filling up without that… or a stop-start while the user fixes it with a symlink or other hack.

The node barfs a lot of information to the terminal that is hard to make sense of… I was expecting most of it to go to the log, with a simple, user-friendly summary appearing instead.

:thinking: should have used vdash

doh!

NAT status is determined to be private!
Error: We have been determined to be behind a NAT.

Maybe next time… :slightly_smiling_face:

5 Likes

I think you’re talking here about the location of the node’s data directories rather than the node binary.

If you want to use alternate locations, we already support that through the --root-dir and --log-output-dest arguments.

If safeup installed the node as a service, during the installation process we could give the user the opportunity to specify locations for those, if that’s the sort of thing you mean. But if you just want the binary, it’s going to be the user’s responsibility to set those to custom locations. Or maybe we could consider using a config file for safenode, and the values for the root directory and logging output location could be set based on values chosen during the install process.
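For anyone reading along, using those two arguments looks something like this (the paths are only an example):

```bash
# Keep the node's data and its logs on a bigger disk instead of the
# defaults under the home directory (example paths only)
safenode --root-dir /mnt/bigdisk/safenode --log-output-dest /mnt/bigdisk/safenode/logs
```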

8 Likes

It’s doing something now. Thanks. :slight_smile:

9 Likes

I am not entirely sure it is really able to connect, but the node is running.

[2023-07-08T12:05:26.772964Z TRACE sn_networking::event] Query task QueryId(178) returned with peers GetClosestPeersOk { key: [0, 32, 236, 247, 101, 85, 53, 101, 146, 101, 140, 88, 115, 107, 95, 149, 18, 121, 16, 191, 94, 53, 240, 93, 36, 96, 222, 144, 228, 23, 51, 96, 29, 230], peers: }, QueryStats { requests: 0, success: 0, failure: 0, start: Some(Instant { t: 216092781000750 }), end: Some(Instant { t: 216092781000750 }) } - ProgressStep { count: 1, last: true }

[2023-07-08T12:05:26.773021Z WARN sn_networking] Not enough peers in the k-bucket to satisfy the request

[2023-07-08T12:05:26.773029Z DEBUG sn_node::api] No network activity in the past 25s, performing a replication query

[2023-07-08T12:05:26.773037Z DEBUG sn_networking] Sending Cmd(RequestReplication(NetworkAddress::PeerId([0, 36, 8, 1, 18, 32, 238, 207, 97, 171, 124, 224, 110, 56, 157, 45, 238, 241, 105, 96, 249, 186, 118, 204, 234, 21, 142, 27, 242, 138, 116, 83, 172, 99, 186, 50, 34, 121]))) to self closest peers.

[2023-07-08T12:05:26.773053Z TRACE sn_networking] Getting the closest peers to NetworkAddress::PeerId([0, 36, 8, 1, 18, 32, 238, 207, 97, 171, 124, 224, 110, 56, 157, 45, 238, 241, 105, 96, 249, 186, 118, 204, 234, 21, 142, 27, 242, 138, 116, 83, 172, 99, 186, 50, 34, 121])

[2023-07-08T12:05:26.773145Z TRACE sn_networking::event] Query task QueryId(179) returned with peers GetClosestPeersOk { key: [0, 36, 8, 1, 18, 32, 238, 207, 97, 171, 124, 224, 110, 56, 157, 45, 238, 241, 105, 96, 249, 186, 118, 204, 234, 21, 142, 27, 242, 138, 116, 83, 172, 99, 186, 50, 34, 121], peers: }, QueryStats { requests: 0, success: 0, failure: 0, start: Some(Instant { t: 216092781223625 }), end: Some(Instant { t: 216092781223625 }) } - ProgressStep { count: 1, last: true }

[2023-07-08T12:05:26.773166Z WARN sn_networking] Not enough peers in the k-bucket to satisfy the request

[2023-07-08T12:05:31.472157Z DEBUG sn_logging::metrics] {"physical_cpu_threads":8,"system_cpu_usage_percent":34.869724,"system_total_memory_mb":17179.87,"system_memory_used_mb":2773.3157,"system_memory_usage_percent":16.142822,"network":{"interface_name":"en0","bytes_received":823296,"bytes_transmitted":12288,"total_mb_received":2115.037,"total_mb_transmitted":1937.7306},"process":{"cpu_usage_percent":0.000991196,"memory_used_mb":8.953856,"bytes_read":0,"bytes_written":0,"total_mb_read":7.323648,"total_mb_written":0.004096}}

[2023-07-08T12:05:36.477606Z DEBUG sn_logging::metrics] {"physical_cpu_threads":8,"system_cpu_usage_percent":37.699142,"system_total_memory_mb":17179.87,"system_memory_used_mb":2772.0334,"system_memory_usage_percent":16.135359,"network":{"interface_name":"en0","bytes_received":4096,"bytes_transmitted":4096,"total_mb_received":2115.0413,"total_mb_transmitted":1937.7346},"process":{"cpu_usage_percent":0.0005436624,"memory_used_mb":8.953856,"bytes_read":0,"bytes_written":0,"total_mb_read":7.323648,"total_mb_written":0.004096}}

[2023-07-08T12:05:41.483221Z DEBUG sn_logging::metrics] {"physical_cpu_threads":8,"system_cpu_usage_percent":31.520517,"system_total_memory_mb":17179.87,"system_memory_used_mb":2770.8743,"system_memory_usage_percent":16.128613,"network":{"interface_name":"en0","bytes_received":2179072,"bytes_transmitted":15360,"total_mb_received":2117.2205,"total_mb_transmitted":1937.75},"process":{"cpu_usage_percent":0.0005686476,"memory_used_mb":8.953856,"bytes_read":0,"bytes_written":0,"total_mb_read":7.323648,"total_mb_written":0.004096}}

[2023-07-08T12:05:46.487127Z DEBUG sn_logging::metrics] {"physical_cpu_threads":8,"system_cpu_usage_percent":35.12875,"system_total_memory_mb":17179.87,"system_memory_used_mb":2782.5603,"system_memory_usage_percent":16.196632,"network":{"interface_name":"en0","bytes_received":6144,"bytes_transmitted":3072,"total_mb_received":2117.2266,"total_mb_transmitt

4 Likes

I had one node receiving chunks yesterday but two I started today on the same machine remained empty.

Cleared everything out and started again with three more and I’m not getting chunks in any of them. These lines recur in the logs, with ProgressStep followed by Not enough peers in every case:

ProgressStep { count: 1, last: true }
[2023-07-08T12:28:58.060303Z WARN sn_networking] Not enough peers in the k-bucket to satisfy the request
1 Like

OK, just to be certain: as it is behind NAT but with port forwarding, do I need to be concerned that it is not fully reachable?

Does this relate to @chriso suggesting opening the full range of TCP ports? I do, however, have one node running perfectly with only a specific port opened.

Edit: ahh I see @JPL is having a similar issue.

2 Likes