Getting the actual percentage would be an exercise in the actuarial sciences, similar to the way risk is calculated in insurance: you would have to figure in things like human error, fires, power loss and the like. So it’s a number that is hard to calculate as well as ever changing. However, I do think that we will find that it’s well below 1%.
Something that you may not be aware of is “Sacrificial Data”. While the Network is tasked with maintaining a minimum of 4(?) copies of any given chunk, there very well may be many more than that stored on the Network.
Sacrificial Data was created in order to measure the total available space of the Network.
This should drive that 1% number down dramatically. So let’s see what we have in your hypothetical so far:
- 100,000 files, each 1GB big = 102,400,000 Chunks [1]
- 102,400,000 Chunks, each stored twice for Primary Chunks = 204,800,000 Primary Chunks
- 102,400,000 Chunks, each stored twice for Secondary Chunks = 204,800,000 Secondary Chunks
- Primary Chunks + Secondary Chunks = 409,600,000 Non-Sacrificial Data Chunks
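For anyone who wants to check the arithmetic so far, here is a minimal Python sketch. It assumes 1MB chunks and 1GB = 1,024MB (which is what the 102,400,000 figure implies); the variable names are mine and don't come from the Network code.

```python
# Sketch of the chunk arithmetic above; all names are illustrative only.
FILES = 100_000
CHUNKS_PER_FILE = 1024    # assumption: 1GB file / 1MB chunk size
COPIES_PRIMARY = 2        # each chunk stored twice as Primary Chunks
COPIES_SECONDARY = 2      # each chunk stored twice as Secondary Chunks

chunks = FILES * CHUNKS_PER_FILE
primary = chunks * COPIES_PRIMARY
secondary = chunks * COPIES_SECONDARY
non_sacrificial = primary + secondary

print(f"{chunks:,} Chunks")
print(f"{primary:,} Primary Chunks")
print(f"{secondary:,} Secondary Chunks")
print(f"{non_sacrificial:,} Non-Sacrificial Data Chunks")
```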
But we’re missing something.
The amount of excess space in the Network will determine the amount of Sacrificial Chunks that are out there. Calculating a probability without considering the change in this value inherent to the system would be foolish. So we’ll have to calculate everything three times - for the two extremes, as well as the mean.
- Sacrificial Data = Chunks * 2 (optimal - Network is balanced)
- Sacrificial Data = Chunks * 1 (mean/average - Network needs more space, but has enough to add more data)
- Sacrificial Data = Chunks * 0 (disastrous - Network is full)
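If you'd rather script the three scenarios than do them by hand, something like the Python sketch below should work. It assumes that "Chunks" in the multipliers means the 102,400,000 unique chunks (not the stored copies), and that every chunk has 4 non-sacrificial copies (2 Primary + 2 Secondary); the function and scenario names are hypothetical.

```python
# Sketch only: total chunks in existence under each Sacrificial Data scenario.
CHUNKS = 102_400_000          # unique chunks in the hypothetical
NON_SACRIFICIAL_COPIES = 4    # assumption: 2 Primary + 2 Secondary per chunk

# Sacrificial multiplier per scenario, as listed above.
SCENARIOS = {"optimal": 2, "mean": 1, "disastrous": 0}

def total_chunks(chunks, sacrificial_multiplier):
    """Non-sacrificial copies plus sacrificial copies for a given multiplier."""
    non_sacrificial = chunks * NON_SACRIFICIAL_COPIES
    sacrificial = chunks * sacrificial_multiplier
    return non_sacrificial + sacrificial

totals = {name: total_chunks(CHUNKS, m) for name, m in SCENARIOS.items()}
for name, total in totals.items():
    print(f"{name}: {total:,} Chunks in existence")
```

Changing a multiplier in SCENARIOS is enough to try other levels of excess space.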
Would you please provide me the totals of all Chunks in existence in each of those three scenarios while I start working on the next bit?
[1] Keep in mind that since the files are all 1GB in size, each one splits evenly into chunks, so we don’t have to worry about small files, such as a 1MB file being split up into 3 x 1MB chunks (achieved by padding). We can investigate this later if we wish.
P.S. Nice link to WolframAlpha. That’s a great way to show computations.