Chunks are limited to 1 MB so I thought it’d be interesting to see what the distribution of chunk sizes on the network might be.
Small files between 3 KB and 1 MB are split into three chunks, each one third the size of the file. Files smaller than 3 KB are stored as a single chunk the same size as the file (thanks @fred below for clarifying).
Large files (over 1 MB) are split into 1 MB chunks; only the last chunk can be smaller.
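Here's a minimal sketch of those rules in Python, just to make the arithmetic concrete. It is not the network's actual chunking code. The function name `chunk_sizes` is made up for illustration, and I'm assuming 1 KB = 1024 bytes, that a file of exactly 1 MB counts as "small", and that any remainder from dividing by three is spread over the first chunks.

```python
from typing import List

KB = 1024          # assumption: binary units
MB = 1024 * KB


def chunk_sizes(file_size: int) -> List[int]:
    """Return the chunk sizes (in bytes) for a file of the given size.

    Hypothetical helper that follows the rules described in this post.
    """
    if file_size <= 0:
        return []
    if file_size < 3 * KB:
        # Tiny file: one chunk, same size as the file.
        return [file_size]
    if file_size <= MB:
        # Small file: three chunks of roughly one third each.
        base, extra = divmod(file_size, 3)
        return [base + (1 if i < extra else 0) for i in range(3)]
    # Large file: as many full 1 MB chunks as fit, plus a smaller last chunk.
    full, rest = divmod(file_size, MB)
    return [MB] * full + ([rest] if rest else [])
```

For example, `chunk_sizes(2 * MB + 5)` gives two 1 MB chunks followed by one 5 byte chunk, which is why a large file can still contribute one chunk to the small buckets.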
Looking at all the files in my $HOME directory, this is the distribution I found:
Gathering Current User HomeDir Stats
Total files: 355401
Files larger than 1 MB: 2377 (29.206469 GB)
Files smaller than 1 MB: 353024 (19.043795 GB)
Total chunks: 768004
Large chunks: 28785
Small chunks: 739219
Chunk Size (KB)    Count
0-100             672286
100-200            62070
200-300             2753
300-400              570
400-500              303
500-600              608
600-700              229
700-800              103
800-900              131
900-1000             134
1000+              28817
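For anyone wanting to try this on their own machine, a survey like this could be sketched roughly as follows. This is not the script that produced the numbers above. The names `chunk_sizes`, `bucket_label` and `chunk_histogram` are hypothetical, and it assumes 1 KB = 1024 bytes, 100 KB wide buckets, and the chunking rules as described in this post. Symlinks and unreadable files are skipped.

```python
import os
from collections import Counter

KB = 1024          # assumption: binary units
MB = 1024 * KB


def chunk_sizes(file_size):
    """Chunk sizes in bytes, following the rules described in this post."""
    if file_size <= 0:
        return []
    if file_size < 3 * KB:
        return [file_size]
    if file_size <= MB:
        base, extra = divmod(file_size, 3)
        return [base + (1 if i < extra else 0) for i in range(3)]
    full, rest = divmod(file_size, MB)
    return [MB] * full + ([rest] if rest else [])


def bucket_label(chunk_size):
    """Map a chunk size in bytes to a 100 KB wide bucket label."""
    index = chunk_size // (100 * KB)
    if index >= 10:
        return "1000+"
    return f"{index * 100}-{(index + 1) * 100}"


def chunk_histogram(root):
    """Walk a directory tree and count the chunks that fall in each bucket."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or can't be read.
                continue
            for chunk in chunk_sizes(size):
                counts[bucket_label(chunk)] += 1
    return counts
```

Calling `chunk_histogram(os.path.expanduser("~"))` returns a `Counter` keyed by the same bucket labels as the table, so the results can be compared directly.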
Obviously this will be different for everyone, and this machine does not have many media files so probably doesn’t represent the average user very well (I’d expect more 1000+ size chunks from an average user).
The ‘medium’ chunks (between 100 and 1000 KB) only account for a small share of the total: 66,901 of 768,004 chunks, or about 9%. The other 91% of chunks, counted by quantity, are either very small (under 100 KB) or full-size (1000+ KB).
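The shares can be checked directly from the table above. This is just the arithmetic, using the counts exactly as listed; the variable names are mine.

```python
# Chunk counts per size bucket (KB), copied from the table in this post.
counts = {
    "0-100": 672286,
    "100-200": 62070,
    "200-300": 2753,
    "300-400": 570,
    "400-500": 303,
    "500-600": 608,
    "600-700": 229,
    "700-800": 103,
    "800-900": 131,
    "900-1000": 134,
    "1000+": 28817,
}

total = sum(counts.values())
small = counts["0-100"]
large = counts["1000+"]
medium = total - small - large  # everything between 100 and 1000 KB


def share(n):
    """Percentage of all chunks, rounded to one decimal place."""
    return round(100 * n / total, 1)


print(f"total chunks: {total}")
print(f"small  (<100 KB):    {small:>7} ({share(small)}%)")
print(f"medium (100-1000 KB): {medium:>6} ({share(medium)}%)")
print(f"large  (1000+ KB):   {large:>7} ({share(large)}%)")
```

That works out to roughly 87.5% small, 8.7% medium and 3.8% large by count.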
So from a traffic perspective, it looks like the network will mainly be dealing with the following, ordered by quantity:
- routing messages (should be < 100 KB; large quantity but small size)
- small chunks (< 100 KB; also large quantity but small size)
- medium chunks (between 100 and 1000 KB)
- large chunks (> 1000 KB; should be small quantity but large size)
The list is reversed when ordered by the amount of bandwidth consumed.
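The main takeaway, I think, is that latency may be a big bottleneck: if most of what the network handles is under 100 KB, the round trip for each item is likely to matter more than raw bandwidth. At this stage it’s just speculation, but it starts to give some idea of what the ‘shape’ of content traversing the network might be like.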
I was surprised how little the ‘middle of the spectrum’ counts toward network activity.