Good question, actually…
I just think that these millions of nodes don’t have the resources needed to do any real work with the data. And when the data starts to flow, nodes start to drop, and there isn’t enough capacity or time to reorganize.
Or, the network works just fine from datacenters with no data flowing. But when data starts to move for real, the data center service providers decide that it is somehow “suspicious activity”, and cut the connection.
Or something else. Just a hunch, really, because the way the network is set up goes so much against the principles that I have been told are essential.
But, this is pure speculation.
And what would a “collapse” be in this situation? Maybe just a network jammed for a few hours, maybe some unimportant data lost, and then getting back on its feet again.
I think with 5 copies you will have a hard time trying to bother 3 million nodes…
This is the problem of data storage rewards not matching data upload costs.
A near-empty network should have tiny rewards. That would drive away big operators, as their overheads would be too high.
For whatever reason, there is a push for network size growth. Maybe we will find out soon enough from the team.
I don’t understand, what do you mean?
This network is super strong, and trying to bring it down is already a very challenging and expensive endeavour with a questionable chance of success.
PS: Stupid me… I mixed up the numbers… I wanted to go with 4 TB to have a nice 1 million chunks… But now it’s easier to just correct it to 16 TB of data…
Pps:
Nooooo
… It’s all screwed up… I just deleted the calculation since it doesn’t make sense…
Still, a sudden drop of 20% of nodes would leave 10% of 1.31 GB files lost. (Link to calculator)
Given the (likely) centralized nature of things at the moment, I would not consider a 20% drop all that unlikely. It’s not a collapse of the network, but a significant data loss nevertheless.
Nevermind. I’m too tired to understand anyway, going to sleep. Good night!
Not the case, because excess copies exist and we happen to have significantly more than 5 copies in such an empty network.
With the old replication system, the potential was:
0.2 ^ 5 == 1 in 3,125 chunks lost completely. If files average 10 chunks (small network, lots of web pages, blog posts and test files), that works out to roughly 1 in 310 files with a missing chunk, down to about 1 in 1,000 files if most files are 3 chunks.
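Just to make that arithmetic explicit, here is a tiny Python sketch. The 5 copies and the 20% drop are the figures from above; the 10-chunk and 3-chunk averages are only the illustrative file sizes mentioned:

```python
# Back-of-envelope: chance of losing a chunk outright when 20% of nodes
# vanish at once and each chunk is held by 5 independent nodes.
drop_fraction = 0.2   # 20% of nodes go down instantly
copies = 5            # old replication factor

p_chunk_lost = drop_fraction ** copies
print(f"per-chunk loss: 1 in {1 / p_chunk_lost:,.0f}")  # 1 in 3,125

# A file is affected if ANY of its chunks is lost.
for chunks_per_file in (10, 3):
    p_file_hit = 1 - (1 - p_chunk_lost) ** chunks_per_file
    print(f"{chunks_per_file}-chunk file: 1 in {1 / p_file_hit:,.0f} files with a missing chunk")
```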
Not 10% but 3%.
Now, with the new replication, where close neighbours also store the chunks and can retrieve them for clients and for replication, it’s up to 70 nodes holding each chunk. And since the network is empty, this will be close to the full 70 nodes holding each and every chunk.
So, if we take just 20 nodes instead of 70, then the maths is:
0.2 ^ 20 == a 1 in ~95,367 billion chance of one chunk being lost.
Too much? Then take at least 15 nodes:
0.2 ^ 15 == a 1 in ~30 billion chance of one chunk being lost,
and that is for an instantaneous loss occurring in less than 5 minutes.
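Same sketch with the new replication figures, just to show where the “billions” come from. The 20 and 15 holders per chunk are the conservative counts used above; the real number is said to be closer to 70:

```python
drop_fraction = 0.2   # 20% of nodes disappear before replication can react

# Conservative holder counts per chunk, taken from the post above.
for holders in (20, 15):
    p_chunk_lost = drop_fraction ** holders
    print(f"{holders} holders: 1 in {1 / p_chunk_lost:,.0f} chance a given chunk is lost")
```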
That’s pretty impressive. I hadn’t realised it was quite so robust. Nice!
I don’t really doubt you, but is it really so? With those numbers I would expect a few more folks reporting that their nodes hold at least some chunks.
But maybe the uploads are just that sparse?
…even before it was 3 million it was hundreds of thousands of nodes …
I have records of chunks on about 8 in 190 nodes on one SBC I checked yesterday.
Over 130,000 chunks in 3 million nodes, but I expect it’s more today.
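A toy sanity check on those figures, reusing the 5-copy and 70-holder numbers from earlier in the thread. It assumes copies are spread uniformly at random, which the real XOR-distance placement only roughly resembles, so treat it as a rough picture, not a measurement:

```python
import math

chunks_stored = 130_000   # chunk count quoted above (likely higher now)
nodes = 3_000_000         # node count quoted above

# Both replication figures appear earlier in the thread.
for holders_per_chunk in (5, 70):
    mean_per_node = chunks_stored * holders_per_chunk / nodes
    # Under uniform random placement the per-node count is roughly Poisson,
    # so the share of nodes holding nothing at all is about exp(-mean).
    share_empty = math.exp(-mean_per_node)
    print(f"{holders_per_chunk} holders/chunk: ~{mean_per_node:.2f} copies per node, "
          f"~{share_empty:.0%} of nodes with no chunks at all")
```

Real placement is lumpy (close groups in XOR space, nodes joining at different times), so individual node reports can differ a lot from these averages.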
I think what @Toivo is getting at is cascading failures of overprovisioned nodes. One node operator or datacenter can go down, no problem - we have redundancy. But when nodes are overprovisioned too much, the additional load from chunk redistribution can cause other nodes to drop, which puts more load on the remaining nodes, and so on…
We saw that in testnets, but at that time the number of nodes, the network load and the network fullness were very different. So we will see, I guess, because we cannot know how many nodes are overprovisioned and by how much.