Ah, rusty indeed!
Successfully parsed transfer.
Verifying transfer with the Network...
Successfully verified transfer.
Successfully stored cash_note to wallet dir.
Old balance: 0.000000000
New balance: 100.000000000
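For anyone following along at home, output like the above comes from redeeming a transfer with the safe CLI; the exact subcommand and arguments are from memory, so treat this as a sketch and check safe wallet --help on your build:
safe wallet receive TRANSFER_HEX_STRING    # paste the transfer string you were given (placeholder)
safe wallet balance                        # confirm the new balance afterwards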
Message to all
Please do not be discouraged by the hassles we are having today.
Always, always remember – If this was easy, any bugger could do it.
And thank you sir as well!
Successfully parsed transfer.
Verifying transfer with the Network...
Successfully verified transfer.
Successfully stored cash_note to wallet dir.
Old balance: 100.000000000
New balance: 120.000000000
3 … (these are big logs)
clientlog-3.zip (3.0 MB)
My 100/100 Mbit connection was maxed out by a single node.
It is self-DDoS. I’m stopping this thing.
and 4
clientlog-4.zip (5.8 MB)
My nodes:
------------------------------------------
Timestamp: Mon Nov 13 12:43:49 EST 2023
Number: 0
Node: 12D3KooWCsbrfDaQQK5cxTJsXSgnkesWDv8oWPjeQiwqEyD8SZUo
PID: 2394
Memory used: 360.449MB
CPU usage: 39.9%
File descriptors: 1308
Records: 2048
Disk usage: 855MB
Rewards balance: 0.000000014
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 1
Node: 12D3KooWGStEA4Eaz1TpSUHDwLk5ZDn7j29CTwczNJRSwhE2noDF
PID: 2432
Memory used: 402.512MB
CPU usage: 25.0%
File descriptors: 1422
Records: 2048
Disk usage: 838MB
Rewards balance: 0.000000000
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 2
Node: 12D3KooWJfoSWNppjxusoQc9GtFr5VAkCkGwpVJJ4tFAUtnZfgkj
PID: 2403
Memory used: 274.691MB
CPU usage: 14.2%
File descriptors: 1336
Records: 1611
Disk usage: 657MB
Rewards balance: 0.000000022
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 3
Node: 12D3KooWE78nZZPz2qUxzsHLqzBw3a565PkN5r4j1oL5HaYqTsje
PID: 2423
Memory used: 390.941MB
CPU usage: 34.8%
File descriptors: 1449
Records: 2048
Disk usage: 862MB
Rewards balance: 0.000000000
------------------------------------------
Timestamp: Mon Nov 13 12:43:50 EST 2023
Number: 4
Node: 12D3KooWF5AkwmxnAmks8qzzqqLSzyZHMXXjHAaiy7Zo1KtykDVD
PID: 2412
Memory used: 175.203MB
CPU usage: 12.2%
File descriptors: 900
Records: 0
Disk usage: 4.0K
Rewards balance: 0.000000000
------------------------------------------
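In case anyone wants to generate a similar per-node summary themselves, here is a minimal sketch using only standard tools; the node root-dir path and record_store layout are assumptions, so adjust them to your own install:
for pid in $(pgrep -x safenode); do
  echo "------------------------------------------"
  echo "Timestamp: $(date)"
  echo "PID: $pid"
  # Resident memory (MB) and CPU for this safenode process
  ps -p "$pid" -o rss=,pcpu= | awk '{printf "Memory used: %.3fMB\nCPU usage: %s%%\n", $1/1024, $2}'
  echo "File descriptors: $(ls /proc/"$pid"/fd | wc -l)"
done
# Record counts and disk usage per node root dir (this path is an assumption; adjust to your setup)
for dir in "$HOME"/.local/share/safe/node/*/; do
  echo "Records: $(find "$dir" -path '*record_store*' -type f 2>/dev/null | wc -l)"
  echo "Disk usage: $(du -sh "$dir" | cut -f1)"
done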
Absolutely.
But as well as that, we need to make sure we have enough RAM on our machines for the number of nodes we want to run. I thought I was playing it safe by starting only 10 on a 4GB machine that I’d have started 40 on before. It wasn’t to be, so I had to kill them and start just 5, and that will have contributed to the join-and-leave chaos.
Those 5 are now using between 199MB and 244MB each, with between 90 and 140MB free on the machine. So 4GB = 5 nodes. Hopefully. Anything more ambitious leads to disappointment and more node chaos.
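If anyone wants to run the same sizing check, this is a quick way to see what the safenode processes are actually using versus what is left on the box; only the process name is assumed:
free -m    # overall memory picture for the machine
# Sum resident memory across all safenode processes, reported in MB
ps -C safenode -o rss= | awk '{sum += $1} END {printf "safenode total RSS: %d MB across %d nodes\n", sum/1024, NR}'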
FWIW, it’s been only 2 hours so far with the safenode still running, and I have not seen this high a level of network traffic from a safenode build up so quickly in a long time!
It could be seen as an efficiency gain from higher throughput, or as more time spent on replication and the underlying transport protocols than perhaps expected, or it may simply be doing what is expected given churn and other network events happening all too quickly (as expected). I am not really sure.
As and when the MaidSafe team has some spare time (if any, lol), I would like to hear some feedback on the LIBP2P stats panels noted above and in the prior HeapNetTestnet, specifically around:
Why is the KAD Query Result Count for the Success and Failure scenarios exactly the same number?
How should one be viewing or interpreting this data from the metrics endpoint, specifically around the LIBP2P area?
Does anything seem a bit abnormal here to anyone else? For instance, GetRecord (success & failure) have the same total number of items in the histogram. I may not be charting it properly, or the source data might be off, but I won’t be able to dig into this further for at least a few more hours today. In the meantime I will let it collect a bit more data (assuming the testnet still continues) and then do a deeper dive later in the day. Thanks!
Note: For panels with no data, I will double-check whether those fields are still valid in the metrics endpoint or whether the underlying code base has changed in safenode.
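For anyone poking at the same data, the sort of query I am running against each node's metrics endpoint looks roughly like this; the port placeholder and the metric-name filter are assumptions (the actual metrics address is printed in the safenode startup log), so treat it as a sketch rather than gospel:
# Scrape one node's OpenMetrics endpoint and pull out the Kademlia query-result series
curl -s http://127.0.0.1:METRICS_PORT/metrics | grep -i 'kad_query'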
Yes, in retrospect I should have started with fewer nodes, as well as remembering to delete the client directory.
But when a testnet starts, I don’t think. Maybe the quickstart section can include this in future:
@happybeing slow down!! Remember to pkill your nodes, delete your client directory, and only run five to start with. After all, you can always add more later with the latest vdash.
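In practice that boils down to something like the following before each new testnet; the client directory path is an assumption for a default Linux install, so double-check where your build keeps it before deleting anything:
pkill -f safenode                      # stop any nodes left over from the previous testnet
rm -rf ~/.local/share/safe/client      # assumed default client dir; verify the path first
# then start small (e.g. 5 nodes) and watch them in vdash before adding more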
I underestimated too.
To be fair, the OP did warn us about higher expected RAM.
So I halved my usual deployment and still got a pretty good beating.
That is basically what MaidSafe did too: they doubled the resources for their nodes.
I suspect they must have lost a fair amount too.
More than an hour later, I have only managed to upload another 60 chunks from home.
I think this one may be pining for the fjords…
Could it be the network is instinctively republican and simply CBA with this “Royalty” shit?
As before, smaller uploads seem to do better:
🔗 Connected to the Network
Chunking 1 files...
Input was split into 12 chunks
Will now attempt to upload them...
Uploaded 12 chunks in 1 minutes 47 seconds
**************************************
* Payment Details *
**************************************
Made payment of 0.000000000 for 12 chunks
New wallet balance: 99.999482472
**************************************
* Uploaded Files *
**************************************
"The Beat Farmers - Powder Finger-.mp3" 5bfca54d625f35980e179843627db3758f365c092b32194bb9cd8680ef31a04e
POSSIBLE BUG!!!
Made payment of 0.000000000 for 12 chunks
EDIT:
logs at https://file.io/2x7nqgx2Ku9v
Or, eating our own dogfood:
“safe.log” 42275d3fda4a46aac4e31066e4167630f70690187c1e6edc9def64bbcdfabd73
And unsurprisingly:
🔗 Connected to the Network
Downloading The Beat Farmers - Powder Finger-.mp3 from 5bfca54d625f35980e179843627db3758f365c092b32194bb9cd8680ef31a04e
Error downloading "The Beat Farmers - Powder Finger-.mp3": Network Error Record was not found locally.
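For reference, the upload and download above were done with the plain client commands, roughly as below; the exact argument order is from memory, so check safe files --help if it complains:
safe files upload "The Beat Farmers - Powder Finger-.mp3"
# Fetch it back by name and the address printed at upload time
safe files download "The Beat Farmers - Powder Finger-.mp3" 5bfca54d625f35980e179843627db3758f365c092b32194bb9cd8680ef31a04e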
I’ve had to restart multiple VPSs, killing several hundred nodes.
The systems had ground to a standstill. If she falls over, my bad.
Try restarting with fewer nodes per VPS.
I am about to kill 50 and restart with 20.
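A rough sketch of that sort of restart, in case it is useful to anyone; the flag names and port range are assumptions, so confirm them against safenode --help for the current build:
pkill -x safenode && sleep 5                 # stop the existing nodes
for i in $(seq 1 20); do
  # one root dir and one listening port per node (both flags assumed)
  safenode --root-dir "$HOME/safenode-$i" --port $((12000 + i)) &
done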
Have a go with the latest client. That should be fixed there.
Storing Spends counts towards your store cost, but Spends themselves are free to store.
In general it looks like we’ll have to try out some other angles for reducing gossip memory (if there is not a bug here). We’ve got a few more ideas, but things were looking much better than this in our internal testing… clearly we need to crank that up a notch…
That said, taking a peek at our heavy, heavy nodes, I am also seeing soooo many records stored, and I think it may well be replication tweaks causing this load…
Leaving this up for now.
The safenode pid just died due to OOM (hitting the upper limit of available RAM on the LXC) on my node.
I condensed the above graph to the areas I thought were a bit interesting.
A few observations:
Request Response Sent under the SN_Networking:Event drops to 0 and stays there.
Data was coming back from the metrics endpoint over the whole time frame, and data was flowing into the record_store since 16:40 UTC (well before the rampant memory rise).
The Chunk Deleted # is above 0 on this testnet, while prior testnets were at 0 for this metric endpoint.

Ah yes, further to the above note to @southside, you can get chunks, but if it’s replicated data you would not be paid. So that adds up.
Certainly seems to me like we’re seeing more replication than we can cope with at the moment.