Royalties2 [Testnet 13/11/23] [Offline]

I think you should win something for that achievement :trophy: :joy:


Similar here. Only 2 nodes survived, and 1 node got past 10 GB of memory usage …

------------------------------------------
Timestamp: Mon Nov 13 18:51:12 EST 2023
Number: 0
Node: 12D3KooWCsbrfDaQQK5cxTJsXSgnkesWDv8oWPjeQiwqEyD8SZUo
PID: 2394
Memory used:
CPU usage:
ls: cannot access '/proc/2394/fd/': No such file or directory
File descriptors: 0
Records: 2048
Disk usage: 853MB
Rewards balance: 0.000000014
------------------------------------------
Timestamp: Mon Nov 13 18:51:12 EST 2023
Number: 1
Node: 12D3KooWGStEA4Eaz1TpSUHDwLk5ZDn7j29CTwczNJRSwhE2noDF
PID: 2432
Memory used:
CPU usage:
ls: cannot access '/proc/2432/fd/': No such file or directory
File descriptors: 0
Records: 2052
Disk usage: 859MB
Rewards balance: 0.000000000
------------------------------------------
Timestamp: Mon Nov 13 18:51:12 EST 2023
Number: 2
Node: 12D3KooWJfoSWNppjxusoQc9GtFr5VAkCkGwpVJJ4tFAUtnZfgkj
PID: 2403
Memory used: 11452.4MB
CPU usage: 84.1%
File descriptors: 688
Records: 2051
Disk usage: 847MB
Rewards balance: 0.000000022
------------------------------------------
Timestamp: Mon Nov 13 18:51:12 EST 2023
Number: 3
Node: 12D3KooWE78nZZPz2qUxzsHLqzBw3a565PkN5r4j1oL5HaYqTsje
PID: 2423
Memory used:
CPU usage:
ls: cannot access '/proc/2423/fd/': No such file or directory
File descriptors: 0
Records: 2056
Disk usage: 865MB
Rewards balance: 0.000000000
------------------------------------------
Timestamp: Mon Nov 13 18:51:12 EST 2023
Number: 4
Node: 12D3KooWF5AkwmxnAmks8qzzqqLSzyZHMXXjHAaiy7Zo1KtykDVD
PID: 2412
Memory used: 236.617MB
CPU usage: 21.6%
File descriptors: 721
Records: 2048
Disk usage: 859MB
Rewards balance: 0.000000000
------------------------------------------

It means an internal Rust channel sending between threads is maxed out, which means nodes are doing wayyyy more than they can cope with here.
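To picture what a "maxed out" channel looks like, here is a minimal sketch. It is not the node's code: Python's `queue.Queue` with a `maxsize` stands in for a bounded Rust channel, and the function name, the tick model and all the numbers are made up for illustration. When the producer enqueues faster than the consumer drains, the queue fills and further sends are rejected.

```python
import queue


def simulate_backpressure(capacity, ticks, produced_per_tick, consumed_per_tick):
    """Count messages rejected because a bounded channel was full.

    Hypothetical model: each tick the producer tries to send
    `produced_per_tick` messages, then the consumer handles at most
    `consumed_per_tick` of them.
    """
    channel = queue.Queue(maxsize=capacity)  # stand-in for a bounded channel
    rejected = 0
    for _ in range(ticks):
        # Producer side: try to enqueue without blocking.
        for _ in range(produced_per_tick):
            try:
                channel.put_nowait("event")
            except queue.Full:
                rejected += 1  # the channel is maxed out
        # Consumer side: drain a limited number of messages per tick.
        for _ in range(consumed_per_tick):
            try:
                channel.get_nowait()
            except queue.Empty:
                break
    return rejected


if __name__ == "__main__":
    print("overloaded:", simulate_backpressure(4, 3, 3, 1))
    print("keeping up:", simulate_backpressure(4, 10, 1, 1))
```

As long as the consumer keeps pace nothing is rejected. Once the work arriving per tick exceeds what gets handled, rejections start and keep growing, which would match a node that has more to do than it can cope with.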


I’ll be bringing what’s left of this down! :bowing_man: Thanks everyone for digging in and reporting back findings!


Looks like we have a couple fixes in already and one to land soon. This sounds like something that could help the quickly filling nodes and the side effects of those:

I would be really happy if the next iteration comes soon.


Why the change of heart?

I didn’t get to play yesterday… :slightly_frowning_face:


“Don’t artificially push replication”

What does that mean?

Were we getting more replication than required, is that why nodes held so many records after very little time?


Previously, in the bygone years of a few weeks ago, we had this to ensure that replication was always on the go if there was something to replicate.

Now we replicate on PUT (which we never did before), and we have gossip… The former drives replication along there. In this last testnet the latter was also driving replication, due to what’s being removed in the linked PR. That itself may well have been causing a lot of load, and it is unneeded now that we replicate on PUT.

(Not to mention the intermittent replication, which we also did not have when the push was added, I believe.)

“Were we getting more replication than required”

It was being triggered wayyy more often than required.

“is that why nodes held so many records after very little time?”

After trimming down to pay one node, we’d increased the replication surface area (i.e., replicate to CLOSE_GROUP + x), which may also have been overkill given the other recent replication changes.

We were definitely causing too many nodes to feel responsible for data here. So that’s also been reduced in the PR there.
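As a rough illustration of why widening the range to CLOSE_GROUP + x fills nodes faster, here is a toy model. Everything in it is an assumption and none of it is the network's code: the `CLOSE_GROUP_SIZE` value, the helper names and the SHA-256 derived ids are hypothetical, and closeness is taken to be Kademlia-style XOR distance.

```python
import hashlib

CLOSE_GROUP_SIZE = 5  # hypothetical value, for illustration only


def make_id(name):
    """Derive a toy 256-bit id from a label (assumption, not the real scheme)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


def holders(record_key, node_ids, extra=0):
    """Return the CLOSE_GROUP_SIZE + extra nodes closest to record_key by XOR."""
    ranked = sorted(node_ids, key=lambda node: node ^ record_key)
    return ranked[: CLOSE_GROUP_SIZE + extra]


def average_records_per_node(num_nodes, num_records, extra):
    """Average number of records each node ends up responsible for."""
    nodes = [make_id(f"node-{i}") for i in range(num_nodes)]
    total = 0
    for r in range(num_records):
        total += len(holders(make_id(f"record-{r}"), nodes, extra))
    return total / num_nodes


if __name__ == "__main__":
    for extra in (0, 3):
        print(extra, average_records_per_node(50, 100, extra))
```

In this model every extra node in the range adds one more holder per record, so the average load per node grows in direct proportion to CLOSE_GROUP_SIZE + x.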


Thanks Josh!


FYI, looking at my logs, there are only three payment accepted notifications in them. Many PUTs, but almost no occurrences of “nanos accepted for record” in any of my node logfiles.

Anyone want to check their logs for this:

grep "nanos accepted for record" ~/.local/share/safe/node/*/logs/safenode.*
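If you would rather have a count per logfile than the raw matching lines, something like this sketch could work. The function name is made up, and it assumes the logfiles are plain text in the same location the grep above uses; it only counts lines containing that one message.

```python
import glob
import os

PATTERN = "nanos accepted for record"


def count_matches(glob_pattern, needle=PATTERN):
    """Map each matching logfile path to the number of lines containing needle."""
    counts = {}
    for path in sorted(glob.glob(os.path.expanduser(glob_pattern))):
        # errors="replace" keeps the scan going past any undecodable bytes.
        with open(path, errors="replace") as handle:
            counts[path] = sum(1 for line in handle if needle in line)
    return counts


if __name__ == "__main__":
    logs = "~/.local/share/safe/node/*/logs/safenode.*"
    for path, hits in count_matches(logs).items():
        print(hits, path)
```

Files with a count of 0 are still listed, which makes it easy to spot nodes that never logged an accepted payment.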

Aye, that rapid record expansion over the network is, I think, what’s caused the issue here. And with that, it would also be very hard to upload chunks with prices changing so fast.


Thanks. I’ve taken note of this.
