Royalties2 [Testnet 13/11/23] [Offline]

Ah, rusty indeed!

Successfully parsed transfer.
Verifying transfer with the Network...
Successfully verified transfer.
Successfully stored cash_note to wallet dir.
Old balance: 0.000000000
New balance: 100.000000000

:pray: :pray:

1 Like

Message to all

Please do not be discouraged by the hassles we are having today.

Always, always remember – If this was easy, any bugger could do it.

4 Likes

And thank you sir as well!

Successfully parsed transfer.
Verifying transfer with the Network...
Successfully verified transfer.
Successfully stored cash_note to wallet dir.
Old balance: 100.000000000
New balance: 120.000000000
2 Likes

3 … (these are big logs)

clientlog-3.zip (3.0 MB)

My 100/100 Mbit was maxed out by a single node.
It is self-DDoS. I’m stopping this thing.

4 Likes

and 4
clientlog-4.zip (5.8 MB)

My nodes:

------------------------------------------
Timestamp: Mon Nov 13 12:43:49 EST 2023
Number: 0
Node: 12D3KooWCsbrfDaQQK5cxTJsXSgnkesWDv8oWPjeQiwqEyD8SZUo
PID: 2394
Memory used: 360.449MB
CPU usage: 39.9%
File descriptors: 1308
Records: 2048
Disk usage: 855MB
Rewards balance: 0.000000014
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 1
Node: 12D3KooWGStEA4Eaz1TpSUHDwLk5ZDn7j29CTwczNJRSwhE2noDF
PID: 2432
Memory used: 402.512MB
CPU usage: 25.0%
File descriptors: 1422
Records: 2048
Disk usage: 838MB
Rewards balance: 0.000000000
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 2
Node: 12D3KooWJfoSWNppjxusoQc9GtFr5VAkCkGwpVJJ4tFAUtnZfgkj
PID: 2403
Memory used: 274.691MB
CPU usage: 14.2%
File descriptors: 1336
Records: 1611
Disk usage: 657MB
Rewards balance: 0.000000022
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 3
Node: 12D3KooWE78nZZPz2qUxzsHLqzBw3a565PkN5r4j1oL5HaYqTsje
PID: 2423
Memory used: 390.941MB
CPU usage: 34.8%
File descriptors: 1449
Records: 2048
Disk usage: 862MB
Rewards balance: 0.000000000
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 4
Node: 12D3KooWF5AkwmxnAmks8qzzqqLSzyZHMXXjHAaiy7Zo1KtykDVD
PID: 2412
Memory used: 175.203MB
CPU usage: 12.2%
File descriptors: 900
Records: 0
Disk usage: 4.0K
Rewards balance: 0.000000000
------------------------------------------
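For anyone wanting to generate a similar per-node summary, here is a minimal Python sketch using psutil. The peer ID, PID, and node directory in the NODES mapping are placeholders, and the assumption that records live under <node_dir>/record_store may not match your layout; it also skips the rewards balance, which needs the node's own wallet.

```python
# Minimal per-node status report (assumes psutil is installed and that each
# node keeps its records under <node_dir>/record_store -- adjust to your layout).
import datetime
import pathlib
import psutil

# Hypothetical node list: number -> (peer id, pid, node directory).
NODES = {
    0: ("12D3KooW...", 2394, pathlib.Path.home() / ".local/share/safe/node/node-0"),
}

def dir_size_mb(path: pathlib.Path) -> float:
    """Total size of all files under `path`, in MB."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file()) / 1024**2

for number, (peer_id, pid, node_dir) in NODES.items():
    proc = psutil.Process(pid)
    records = node_dir / "record_store"
    print("-" * 42)
    print(f"Timestamp: {datetime.datetime.now():%a %b %d %H:%M:%S %Y}")
    print(f"Number: {number}")
    print(f"Node: {peer_id}")
    print(f"PID: {pid}")
    print(f"Memory used: {proc.memory_info().rss / 1024**2:.3f}MB")
    print(f"CPU usage: {proc.cpu_percent(interval=1.0):.1f}%")
    print(f"File descriptors: {proc.num_fds()}")
    print(f"Records: {sum(1 for _ in records.iterdir()) if records.exists() else 0}")
    print(f"Disk usage: {dir_size_mb(node_dir):.0f}MB")
```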
3 Likes

Absolutely.

But as well as that, we need to make sure we have enough RAM on our machines for the number of nodes we want to run. I thought I was playing it safe starting only 10 on a 4GB machine that I’d have started 40 on before. It wasn’t to be, so I had to kill them and start just 5. That will have contributed to the join-and-leave chaos.

Those 5 are now using between 199MB and 244MB each, with between 90 and 140MB free on the machine. So 4GB = 5 nodes. Hopefully. Anything more ambitious leads to disappointment and more node chaos.
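If you'd rather grow the node count cautiously than guess the density up front, a minimal pre-flight check along these lines can help (psutil assumed available; the peak figure comes from the numbers above, and the slack margin is an arbitrary assumption):

```python
# Rough pre-flight check before starting one more node: only add it if free RAM
# comfortably exceeds a node's observed peak plus some slack for churn spikes.
import psutil

PER_NODE_PEAK_MB = 244   # highest per-node usage observed above; adjust to yours
SLACK_MB = 150           # assumed margin for replication/churn spikes

def room_for_another_node() -> bool:
    free_mb = psutil.virtual_memory().available / 1024**2
    return free_mb > PER_NODE_PEAK_MB + SLACK_MB

if __name__ == "__main__":
    print("start another node" if room_for_another_node() else "hold off for now")
```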

7 Likes

FWIW, it’s been only 2 hours so far with the safenode still running, and I have not seen this level of network traffic from a safenode build up so quickly in a long time!

It could be seen as efficiency gains from being able to sustain higher throughput, or as more time spent on replication and the underlying transport protocols than perhaps expected, or in general it’s doing what is expected given churn and other network events happening all too quickly (as expected). I am not really sure, :man_shrugging: .

As and when the MaidSafe team has some spare time (if any :smiley: lol), I would like to hear some feedback on the LIBP2P stats panels noted above and in the prior HeapNetTestnet, specifically around:

Why is the KAD Query Result Count for the Success and Failure scenarios exactly the same number? How should one view or interpret this data from the metrics endpoint, specifically around the LIBP2P area?

Does anything seem a bit abnormal here to anyone else? For instance, the GetRecord (success & failure) histograms have the same total number of items. I may not be charting it properly or the source data might be off, but I won’t be able to dig into this further for at least a few more hours today. In the meantime, I will let it collect a bit more data (assuming the testnet still continues) and then do a deeper dive a bit later in the day. Thanks!

Note: for panels with no data, I will double-check whether those fields are still valid in the metrics endpoint or whether the underlying code base for the safenode pid has changed.
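In the meantime, one quick way to check whether the charting or the source data is at fault is to read the raw Prometheus text straight off the node’s metrics endpoint and compare the success/failure counters by hand. A minimal sketch, assuming the endpoint serves the usual Prometheus text format; the port is a placeholder, and it filters on a loose substring rather than guessing exact metric names:

```python
# Dump every KAD-related sample from a node's metrics endpoint so the raw
# success/failure counters can be compared against the dashboard panels.
import re
import urllib.request

METRICS_URL = "http://127.0.0.1:12001/metrics"  # placeholder port -- use your node's

text = urllib.request.urlopen(METRICS_URL, timeout=5).read().decode()

for line in text.splitlines():
    if line.startswith("#"):          # skip HELP/TYPE comment lines
        continue
    if re.search("kad", line, re.IGNORECASE):
        print(line)
```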

4 Likes

Yes, in retrospect I should have started with fewer nodes, as well as remembered to delete the client directory.

But when a testnet starts, I don’t think. Maybe the quickstart section can include this in future:

@happybeing slow down!! Remember to pkill your nodes, delete your client directory, and only run five to start with. After all, you can always add more later with the latest vdash.

1 Like

I underestimated too.

To be fair, the OP did warn us about the higher expected RAM.
So I halved my usual deployment and still got a pretty good beating.

That is basically what MaidSafe did too: they doubled the resources for their nodes.

I suspect they must have lost a fair amount too.

1 Like

No luck with the faucet.

And also some problems installing the node.

1 Like

More than an hour later, I have only managed to upload another 60 chunks from home.

I think this one may be pining for the fjords…

Could it be the network is instinctively republican and simply CBA with this “Royalty” shit?

As before, smaller uploads seem to do better.

🔗 Connected to the Network
Chunking 1 files...
Input was split into 12 chunks
Will now attempt to upload them...
Uploaded 12 chunks in 1 minutes 47 seconds
**************************************
*          Payment Details           *
**************************************
Made payment of 0.000000000 for 12 chunks
New wallet balance: 99.999482472
**************************************
*          Uploaded Files            *
**************************************
"The Beat Farmers - Powder Finger-.mp3" 5bfca54d625f35980e179843627db3758f365c092b32194bb9cd8680ef31a04e

POSSIBLE BUG!!!

Made payment of 0.000000000 for 12 chunks

EDIT:
logs at https://file.io/2x7nqgx2Ku9v

or eating our own dogfood
“safe.log” 42275d3fda4a46aac4e31066e4167630f70690187c1e6edc9def64bbcdfabd73

And unsurprisingly

🔗 Connected to the Network
Downloading The Beat Farmers - Powder Finger-.mp3 from 5bfca54d625f35980e179843627db3758f365c092b32194bb9cd8680ef31a04e
Error downloading "The Beat Farmers - Powder Finger-.mp3": Network Error Record was not found locally.
4 Likes

I’ve had to restart multiple VPSs, killing several hundred nodes :frowning:
The systems had ground to a standstill, so if she falls over, my bad.

2 Likes

Try restarting with fewer nodes per VPS.
I am about to kill 50 and restart with 20.

1 Like

Have a go with the latest client. That should be fixed there.

Storing Spends counts towards your store cost, but they are free to store.


In general it looks like we’ll have to try out some other angles for reducing gossip memory (if there is not a bug here). We’ve got a few more ideas, but things were looking much better on this front in our internal testing… clearly we need to crank that up a notch…

That said, taking a peek at our heavy, heavy nodes, I am also seeing soooo many records stored, and I think it may well be the replication tweaks causing this load… :thinking:

Leaving this up for now.

9 Likes

The safenode pid on my node just died due to OOM (hitting the upper limit of available RAM on the LXC).

I condensed the above graph to the areas I thought were a bit interesting.

A few observations:

  • Between 18:10 and 18:20 UTC, memory increased super rapidly and continuously (6GB+).
    • At the start of this time period, Chunk Read & Get Request messages were being logged more than any other message type currently being parsed (a rough per-minute counting sketch follows this list).
    • Right after this time period, it was interesting to see messages associated with Request Response Sent under SN_Networking:Event drop to 0, and stay there.
  • Interesting to observe that the wallet balance only went up once (step function), even though the PUT Record stats from the metrics endpoint were 90% Chunk and 10% Spend over the whole time frame, and data had been flowing into the record_store since 16:40 UTC (well before the rampant memory rise).
  • Interesting to see it log a Chunk Deleted count on this testnet, while prior testnets were at 0.
  • Interesting to see ‘SN Replication Triggered’ remain at 0 for the total count on the metrics endpoint.
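Here is the rough per-minute counting sketch mentioned above, assuming the log lines start with an ISO-style timestamp; the log path and the message substrings are taken from the observations and may need adjusting to your own log location and format:

```python
# Rough log triage: count how often a few message substrings appear per minute
# in a safenode log, to see which message types dominate around a memory spike.
from collections import Counter
import pathlib
import re

LOG_FILE = pathlib.Path.home() / ".local/share/safe/node/safenode.log"  # hypothetical path
PATTERNS = ["Chunk Read", "Get Request", "Request Response Sent"]       # assumed substrings

counts: Counter = Counter()
for line in LOG_FILE.read_text(errors="replace").splitlines():
    m = re.match(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2})", line)  # keep timestamp to the minute
    if not m:
        continue
    for pattern in PATTERNS:
        if pattern in line:
            counts[(m.group(1), pattern)] += 1

for (minute, pattern), n in sorted(counts.items()):
    print(f"{minute}  {pattern:<25} {n}")
```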
5 Likes

Ah yes, further to the above note to @southside: you can get chunks, but if it’s replicated data you would not be paid. So that adds up.

Certainly seems to me like we’re seeing more replication than we can cope with at the moment.

7 Likes