New network going up now, testing 4MB chunks. Time to reset your nodes

My internet is really getting hammered now. Connections went from about 23000 to somewhere above 30000, the router max. Interfering with normal household web usage now. First time since new network launched on Thursday.

Don’t get me wrong - I’m not complaining I can’t run as many nodes as before. I’m still providing storage to the network and at as much bandwidth as I can sensibly use. And the nanos are flowing. And you are clearly right that it will be much the same amount of data going in each direction. It’s just the peaks are now a bigger factor and that will come into play for sites with more limited bandwidth. And it’s just a happy accident that it makes it less attractive to run lots of nodes in DCs or Cloud.

And to @Dimitar 's point: fewer nodes being possible on a given bandwidth doesn’t just limit the number of nodes that an ‘average’ home user can run, like from 100 to 50 or 10 to 5. It will also cut off at the knees people who could just about run 1 before.

And to @neo 's point there will be shunning. Much shunning.

Thank you for the reply. It’s good to see you are still as engaged with the community now you are on the other side of the fence!

2 Likes

It seems as if network size dropped by 80% just moments ago. Expect way higher peak demand temporarily if it manages to replicate all those relevant records out to other nodes in time, :crossed_fingers: .

Regardless, it’s good to see this huge downturn, and its impact on peak resources.

Our safenodes haven’t crashed, though memory and CPU are up dramatically (not maxed out).

Not sure of the cause of the 80% drop in network size, possibly a power user? Not sure yet (more to investigate).

4 Likes

I saw that on this page:-
https://network-size.autonomi.space

Is it real though? What if it’s the person who runs that site (I don’t remember who that is) disconnected their 20k nodes and it’s only gone down by 20%?

Could it really be the case that 80% of the nodes were being run by one person who pulled the plug?!

1 Like

I accidentally killed ~50 nodes due to a balls-up relocating /var/lib/docker to its own much larger logical volume about 10-15 mins ago – but I don’t think I am to blame.

Has anyone seen @aatonnomicc ?

1 Like

No, but I am still unconvinced that a “power” user could not cause a cascading failure if they took out 10-15% of the nodes at this time.

3 Likes

Ouch, my handful of machines that were cruising this morning are sweating now.

So whoever had a large chunk of the network and packed in too many nodes got wrecked, I guess.

Second time now.
And if they are running that many they are banking big on rewards while screwing the network over when they go tits up.

We need to ensure adequate resources are provided or greed will wreak havoc.

3 Likes

That is a very valid point. Most of the nodes seem to have gone missing between 1700 and 1720. That would be a really quick cascade.

The likely culprit would be RAM usage spiking, I think. I saw a node reported in Vdash as peaking at 514MB. If a lot of people were running close to the limit of their RAM, there could have been mass node death.

Surely if it were CPU or bandwidth or connections hitting a limit, it would just be a vast amount of shunning and a gentler slope down as people killed nodes.

2 Likes

Whatever the root cause is determined to be, this should end all talk of any early launch until investigations and fixes/mitigations are complete.

3 Likes

Maybe the Launcher, and maybe the Node Manager as well, needs to do a resource check before launching. Can I claim:-

  • 500MB of RAM per node being requested, in case of peaks like this (I saw a node at 514MB)
  • a number of MIPS per node
  • 35GB of storage per node (if we want to go down that route)

If the answer is ‘no’, report on what is lacking and don’t launch the nodes. Feels a bit constrictive, but if it makes the difference between a network that is able to survive and one that isn’t, maybe it’s necessary.
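To make the idea concrete, here is a rough sketch of what such a pre-launch check might look like. It is not how the Launcher or Node Manager actually work; the function names and the `Shortfall` type are made up for illustration. The 500MB and 35GB figures are the ones suggested above, and the MIPS check is left out because no number was proposed for it.

```python
from dataclasses import dataclass

# Per-node figures taken from the suggestion above; not official requirements.
RAM_PER_NODE_MB = 500    # headroom for peaks (a node was seen at 514MB)
DISK_PER_NODE_GB = 35    # storage per node, "if we want to go down that route"


@dataclass
class Shortfall:
    """Hypothetical record of one resource that falls short."""
    resource: str
    needed: float
    available: float


def check_resources(nodes, free_ram_mb, free_disk_gb):
    """Return a list of shortfalls; an empty list means the launch may proceed."""
    shortfalls = []
    need_ram = nodes * RAM_PER_NODE_MB
    if free_ram_mb < need_ram:
        shortfalls.append(Shortfall("RAM (MB)", need_ram, free_ram_mb))
    need_disk = nodes * DISK_PER_NODE_GB
    if free_disk_gb < need_disk:
        shortfalls.append(Shortfall("disk (GB)", need_disk, free_disk_gb))
    return shortfalls


def max_nodes(free_ram_mb, free_disk_gb):
    """Largest node count that passes both checks with the figures above."""
    return int(min(free_ram_mb // RAM_PER_NODE_MB,
                   free_disk_gb // DISK_PER_NODE_GB))


if __name__ == "__main__":
    # Example machine: 8000MB free RAM, 400GB free disk, 20 nodes requested.
    problems = check_resources(20, 8000, 400)
    if problems:
        for p in problems:
            print(f"Not launching: need {p.needed} {p.resource}, have {p.available}")
        print(f"This machine could run at most {max_nodes(8000, 400)} nodes")
    else:
        print("Resources look sufficient, launching")
```

The free RAM and disk values are passed in rather than measured so the logic stays easy to test. A real tool would have to read them from the operating system, for example with Python's `shutil.disk_usage` for the disk figure.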

3 Likes

The thing is folk who are running tens of thousands of nodes can get on without node-manager or launchpad.

So if the network can somehow query whether minimum specs are being provided, and shun nodes that fall short, it needs to. Or @neo 's proposal will help with the over-provisioning of nodes.

2 Likes

Let’s just take a breath and wait and see what happened.

6 Likes

Sorry @chriso , the point is we should not have cause to wait and see what happened at this stage of the game.
This will not look good to our prospective partners.

2 Likes

We need to just wait and see. People can get too dramatic on here sometimes. Anyway, I won’t say any more.

4 Likes

strangely absent, surely just a coincidence in timing ? :thinking: I hope he’s ok and not suffering too much aggro from his router, or worse :cry: :fire_extinguisher: :couch_and_lamp:

5 Likes

I’m shocked.
And with that, I will flounce off.

TBPHWY, losing 75% of the network seems somewhat dramatic, in and of itself, no?

For whatever reason.

3 Likes

Just some perspective, I guess: I am driving past people’s worldly possessions piled up on the sidewalk. Perhaps Chris has a point. :slightly_smiling_face:


Brief detour, now back on topic… where is that @aatonnomicc?

11 Likes

Maybe he ran away after he wrecked the network :thinking:

5 Likes

Or he is hunting :wink:

2 Likes

The one who did it xD

4 Likes