The idea of SAFE replacing the current internet is seductive, but what would that really look like? This post is a back-of-the-envelope look at what the SAFE network would be like if it gained significant traction.
Total Data
In 2014 Facebook stored over 300 PB of data, growing at a rate of 600 TB / day (source).
Let’s say it’s ten times more 4 years later in 2018, and then another ten times more to account for Google / Amazon etc., so 100 times the 2014 Facebook figures overall.
That makes an estimate of 30 EB total and 60 PB of data per day.
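The scaling above can be checked with a few lines of Python. This is only a sketch of the estimate, assuming decimal units (1 PB = 10^15 bytes); the variable names are mine, and the two factors of ten are the guesses from the post, not measured values.

```python
# Decimal storage units, in bytes.
TB = 10**12
PB = 10**15
EB = 10**18

facebook_stored_2014 = 300 * PB   # from the 2014 Facebook figure
facebook_daily_2014 = 600 * TB    # new data per day in 2014

growth_factor = 10                # guess: growth from 2014 to 2018
other_services_factor = 10        # guess: Google / Amazon etc.

total_stored = facebook_stored_2014 * growth_factor * other_services_factor
total_daily = facebook_daily_2014 * growth_factor * other_services_factor

print(total_stored / EB)  # 30.0 (EB stored)
print(total_daily / PB)   # 60.0 (PB of new data per day)
```

Changing either factor shows how sensitive the rest of the numbers are to these two guesses.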
Farmers
There are about 2 billion Facebook users and about 1 billion WeChat users. I think combining them to 3B users is a reasonable estimate for the number of people who might become farmers.
I don’t think it’s reasonable to use individuals as the measure here since companies with data centres will probably control the majority of farming resources (especially bandwidth), but anyway let’s go with the ideal ‘distributed among end users’ idea.
Initial Storage
Since chunk names are distributed very evenly, data should be spread uniformly across sections, so each section holds roughly the same amount.
Assuming an average of 50 farmers per section, there are about 3B / 50 = 60M sections on the network.
30 EB / 60M sections = 500 GB per section
Assuming every vault in a section stores all of that section's chunks, every farmer would need to store around 500 GB.
This seems very achievable.
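Here is the same storage arithmetic as a small Python sketch. It assumes decimal units, the 3B farmers and 50 farmers per section used above, and that every vault holds a full copy of its section's data; the variable names are only for illustration.

```python
GB = 10**9
EB = 10**18

farmers = 3_000_000_000       # estimated number of farmers
farmers_per_section = 50      # assumed average section size
total_stored = 30 * EB        # estimated total data on the network

sections = farmers // farmers_per_section
bytes_per_section = total_stored // sections

print(sections)                 # 60000000
print(bytes_per_section // GB)  # 500 (GB per section, and per farmer
                                # if every vault stores the whole section)
```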
End-User Bandwidth
When storing new data, bandwidth requirements are fairly low.
60 PB per day / 60M sections = 1 GB / day per section = about 0.1 Mbps
What is the ratio of read to write? Maybe 10 times more reads? That adds about 1 Mbps of reads to the 0.1 Mbps of writes, which makes bandwidth 1.1 Mbps continuous (i.e. not accounting for diurnal peaks etc).
1.1 Mbps seems quite manageable (see Internet Speeds By Country). But this is just to service end users; there’s also the bandwidth consumption for network activity (e.g. churn, consensus, message routing etc).
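The end-user bandwidth figures can be reproduced with the sketch below. It assumes 1 GB = 10^9 bytes, a flat rate spread over 86,400 seconds per day, and the guessed 10:1 read-to-write ratio; `to_mbps` is just a helper name I made up.

```python
SECONDS_PER_DAY = 86_400

def to_mbps(bytes_per_day):
    """Convert a daily byte volume into an average rate in megabits per second."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6

write_bytes_per_day = 10**9   # 1 GB / day per section, from the estimate above
read_to_write_ratio = 10      # guess: ten times more reads than writes

write_mbps = to_mbps(write_bytes_per_day)
read_mbps = write_mbps * read_to_write_ratio
total_mbps = write_mbps + read_mbps

print(round(write_mbps, 3))  # 0.093
print(round(total_mbps, 2))  # 1.02
```

The unrounded values come out slightly under the 0.1 Mbps and 1.1 Mbps quoted above, so the rounded figures are a little on the conservative side.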
Churn
When a vault starts, it would need to fetch 500 GB of data to join the section. That’s a lot of data! If this were to take one day it would need about 50 Mbps of continuous data flow (500 GB over 86,400 seconds is roughly 46 Mbps).
This is supplied by all farmers in the section, so at 50 farmers per section it should be about 1 Mbps additional data per farmer to bring a new vault into the section. Combined with end-user bandwidth of 1.1 Mbps, it’s about 2 Mbps.
This also happens when a vault is relocating.
This seems like quite a significant factor and puts some constraints around the desired frequency of relocating.
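To make the churn estimate easier to play with, here is a rough sketch in Python. It assumes the 500 GB per vault and 50 farmers per section from earlier, a one-day join, and (as in the text) that the upload is split evenly across all 50 farmers. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400

def join_rate_mbps(vault_bytes, join_days):
    """Average download rate a new vault needs to fill up in join_days."""
    return vault_bytes * 8 / (join_days * SECONDS_PER_DAY) / 1e6

vault_bytes = 500 * 10**9     # data a new vault must fetch
farmers_per_section = 50      # assumed to share the upload evenly
end_user_mbps = 1.1           # end-user bandwidth estimated earlier

new_vault_mbps = join_rate_mbps(vault_bytes, join_days=1)
per_farmer_mbps = new_vault_mbps / farmers_per_section

print(round(new_vault_mbps, 1))                   # 46.3
print(round(per_farmer_mbps, 2))                  # 0.93
print(round(per_farmer_mbps + end_user_mbps, 2))  # 2.03
```

This only covers a single join at a time; if several vaults join or relocate within the same day, the per-farmer figure would scale up accordingly.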
Considerations
Maybe users will run more than one vault at a time. This means bandwidth and storage per vault is less, but it doesn’t really reduce the per user requirements.
Some farmers will contribute a lot of resources and others not much. But a farmer that can’t ‘keep up’ with the section will be penalized, so it seems like there will be a tendency for large resource providers to have an advantage in that sense.
Maybe not every vault in a section will store every chunk for that section. This would reduce the storage and bandwidth demands, but also reduce the redundancy and security of data.
Maybe joining and relocating will be a gradual process rather than get-all-the-data-right-now, which would lower the bandwidth demand. All the same, there is some need to complete the move eventually and it can’t take too long.
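The last two considerations can be explored with a small parameter sweep. This is a sketch under assumptions that are mine, not part of the SAFE design: each chunk is held by `copies` of the 50 vaults in a section, a joining vault fetches only its share, and the upload is split evenly across the section. The value of 8 copies is an arbitrary example.

```python
SECTION_BYTES = 500 * 10**9   # data per section, from the estimate above
FARMERS = 50                  # assumed farmers per section
SECONDS_PER_DAY = 86_400

def vault_storage_gb(copies):
    """Storage per vault if each chunk is held by `copies` vaults in the section."""
    return SECTION_BYTES * copies / FARMERS / 1e9

def join_mbps_per_farmer(copies, join_days):
    """Upload rate each farmer supplies while one new vault fills up."""
    vault_bytes = SECTION_BYTES * copies / FARMERS
    return vault_bytes * 8 / (join_days * SECONDS_PER_DAY) / FARMERS / 1e6

# copies=50 is the "every vault stores everything" case used in this post;
# copies=8 is a made-up example of partial replication.
for copies in (50, 8):
    for join_days in (1, 7):
        print(copies, join_days,
              round(vault_storage_gb(copies)),
              round(join_mbps_per_farmer(copies, join_days), 3))
```

With these inputs, stretching the join over a week or cutting the number of copies both bring the churn bandwidth well below the end-user bandwidth, at the cost of slower relocation or less redundancy.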
Summary
A network of 3B farmers would be looking at approximately 500 GB storage per farmer and 2 Mbps bandwidth consumption. That’s a surprisingly achievable target.
But do the assumptions hold up? There must be some holes or improvements to this reasoning. What do you think?