> Besides introducing performance limits like I mentioned before
I see no reason why chunking in groups of 3, instead of trying to do them all as one big group, would introduce performance limits. If anything, the user would be able to decrypt the file in 3MB chunks instead of waiting until the whole thing has downloaded (unless I’ve got the decryption process wrong?), which is essential for things like videos.
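To make the streaming point concrete, here is a rough sketch of how a client could decrypt group by group as chunks arrive rather than after the full download. It is not the real self-encryption code: `GROUP_SIZE`, `xor_pad`, `decrypt_stream` and the `name_map` of pre-encryption chunk hashes are names I’ve made up for illustration, and the hash-stretched XOR pad just stands in for the actual obfuscation pass.

```python
import hashlib
from typing import Iterable, Iterator

GROUP_SIZE = 3  # codependent chunks per group, as proposed above (assumption)

def xor_pad(data: bytes, pad_seed: bytes) -> bytes:
    """XOR `data` against a pad stretched from `pad_seed`.
    Stands in for the real obfuscation pass -- illustration only."""
    pad = b""
    counter = 0
    while len(pad) < len(data):
        pad += hashlib.sha512(pad_seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, pad))

def decrypt_stream(chunks: Iterable[bytes], name_map: list[bytes]) -> Iterator[bytes]:
    """Yield plaintext group by group as chunks arrive, instead of waiting for
    the whole file. `name_map` holds the pre-encryption chunk hashes (the
    data-map idea); each chunk's pad is seeded from the hashes of the *other*
    chunks in its group, so a group can be decrypted on its own."""
    group: list[bytes] = []
    index = 0
    for chunk in chunks:
        group.append(chunk)
        if len(group) == GROUP_SIZE:
            base = index - GROUP_SIZE + 1
            for offset, ciphertext in enumerate(group):
                others = [name_map[base + j] for j in range(GROUP_SIZE) if j != offset]
                yield xor_pad(ciphertext, b"".join(others))
            group = []
        index += 1
    # A short tail group (file size not a multiple of GROUP_SIZE) is skipped
    # here for brevity.
```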
> your method would reduce obfuscation
I don’t know, but if that’s a concern at 3 chunks, why not just increase the maximum chunk-group size to something like 20? Surely any technology that could find 20 random compatible chunks would also render most current-method data vulnerable?
> the desire to ensure that the security model is quantum computing resilient
I get how the chunk size would be relevant, but how does the number of codependent chunks relate to quantum computing?
> larger files can give you much more to work with during the XOR pass such that it becomes (dare I say) impossible to extract any meaningful data
Yep, so with smaller groups either an attacker could successfully reconstruct part of a file, or a network chunk could go missing without killing an enormous file. I suspect the latter scenario is far more likely.
> Since the choice of optimal trade-off between obscurity and redundancy is subjective, it makes a lot of sense to just let the network do this automatically based on file-size.
This makes sense. Instead of the codependent chunk group always spanning the whole file, its size could be calculated from the file size. Take a 2TB file, for example: two 1TB halves would each still have ridiculously high obfuscation, but the chance of the whole file being lost forever gets squared (i.e. drops dramatically, since both halves would have to vanish). I suspect 3 chunks would be enough obfuscation, but if that’s wrong, then any sensible standard for grouping codependent chunks based on file size would make sense to me. Larger files having larger groups of codependent chunks could be good.
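Just to illustrate what "calculated from the file size" might look like, here is one possible rule. The chunk size, the group bounds and the `group_size_for` function are my own assumptions, not anything MaidSafe has specified:

```python
import math

CHUNK_SIZE = 1024 * 1024  # 1 MB chunks -- an assumption for the example
MIN_GROUP = 3             # the "3 chunks is probably enough obfuscation" guess
MAX_GROUP = 20            # the cap floated earlier in this thread

def group_size_for(file_size: int) -> int:
    """Pick how many codependent chunks go in one group. Small files stay
    fully codependent (the current scheme); larger files get bigger groups,
    capped so a single lost chunk can never doom too much of the file."""
    total_chunks = max(1, math.ceil(file_size / CHUNK_SIZE))
    if total_chunks <= MAX_GROUP:
        return total_chunks  # the whole file is one group
    return min(MAX_GROUP, MIN_GROUP + int(math.log2(total_chunks)))

# e.g. a 10 MB file -> one group of 10 chunks,
#      a 2 TB file  -> groups of 20 chunks (the cap).
```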
> Pre-chunking might also negatively affect de-duplication.
I don’t see how… If anything, having a standard would reduce duplication, because non-standard techniques would be the cause of duplication. It could also reduce duplication in edge cases, such as files that have been truncated.
> I guess the point I am trying to make is that you get both benefits with the current scheme, whereas introducing your concept could sacrifice some key properties while increasing complexity.
As I can’t see the added complexity, the issue I see here is high obfuscation vs high resilience. Going back to the file-size-based codependent-chunking idea, I think a balance could be found easily: substantial segmentation combined with substantial chunk grouping to get the best of both worlds.
> It is messy. As @neo mentioned, QuickPar or par2 on Linux try to do this in a standardized way, and are better than a couple of split and cat commands in a terminal or splitting up a zip or tar archive.
It’s not messy, because it doesn’t even exist yet. What’s messy at the moment is the lack of consensus on the best way for users to split/merge files, which wouldn’t be an issue here: the clients would split and merge files anyway, since that’s how chunking works. If a MaidSafe standard is described and implemented before the network goes live, then the messiness gets avoided entirely. As the network increases in popularity, some existing file split/merge issues will become redundant anyway, as traditional email attachment limits disappear.
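To show how little extra machinery a user-facing split/merge standard would need on top of what the client already does, here is a rough sketch. `split` and `merge` are hypothetical helpers, not part of any existing MaidSafe client API:

```python
from typing import Iterable, Iterator

def split(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Split a file into fixed-size pieces -- the same kind of operation the
    client already performs for chunking. (Hypothetical helper.)"""
    with open(path, "rb") as f:
        while True:
            piece = f.read(chunk_size)
            if not piece:
                break
            yield piece

def merge(pieces: Iterable[bytes], out_path: str) -> None:
    """Reassemble the pieces in order -- the 'merge' half a standard would cover."""
    with open(out_path, "wb") as out:
        for piece in pieces:
            out.write(piece)
```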
> specialized apps for specific file-formats that could achieve what you are asking
For things like compression, sure: turn a .wav into a .flac instead of zipping it. In this case, though, the end-user would have to deal with so many differences between apps’ file-merge requirements. Much better to have a standard.
> I’m still new to the forum and learning about the platform, but these are my initial impressions, which might address your points
You raise good points and seem to understand the platform pretty well : ) I appreciate the contribution.
> Client is the code on your machine that provides the APIs to access the network.
So the client doesn’t really do features, just the standard implementation, and an APP would do the prettier stuff?
I’m just gonna quickly suggest a few terms here (with a quick sketch after the list):
- chunk group - a group of chunks, all of which are needed to decrypt the group (under the current system this chunk group would be the entire file)
- chunk group size - the number of chunks in a chunk group (my proposal is to have a standard but non-mandatory cap for this number)
- codependent chunks - chunks in the same chunk group
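And just to make the vocabulary concrete, here is how those terms might map onto a simple data structure. This is only a sketch, not an existing or proposed API:

```python
from dataclasses import dataclass

@dataclass
class ChunkGroup:
    """A chunk group: every chunk named here is codependent -- all of them
    are needed to decrypt any one of them. (Hypothetical structure.)"""
    chunk_names: list[bytes]  # network names/hashes of the codependent chunks

    @property
    def group_size(self) -> int:
        """The chunk group size -- proposed to have a standard, non-mandatory cap."""
        return len(self.chunk_names)
```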