Impressive chunk-based compression libraries

Zstd, igzip, and libdeflate are interesting compression utilities.

libdeflate can be even faster than zstd, another performant library, for certain types of data. It is optimized for compressing data in chunks (e.g. 1 MB), so it could be perfect for self-encryption. It has Rust bindings.

"Libdeflate is optimal in applications that have the input data up-front, or when (large) input datasets can be split into smaller chunks."
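To make the "split into smaller chunks" idea concrete, here is a minimal sketch. It uses Python's standard `zlib` module as a stand-in DEFLATE implementation (an assumption for illustration, it is not libdeflate, and the helper names are made up). Each 1 MB chunk is compressed on its own, so chunks can be compressed or decompressed independently and in parallel.

```python
import zlib

CHUNK_SIZE = 1024 * 1024  # 1 MB, the example chunk size mentioned above


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Compress each fixed-size chunk independently with DEFLATE (zlib)."""
    return [
        zlib.compress(data[i:i + chunk_size], 6)
        for i in range(0, len(data), chunk_size)
    ]


def decompress_chunks(chunks: list) -> bytes:
    """Decompress every chunk and join the results back into one buffer."""
    return b"".join(zlib.decompress(c) for c in chunks)


# Hypothetical, highly repetitive payload of 4,000,000 bytes.
data = b"some fairly repetitive example payload. " * 100_000
chunks = compress_chunks(data)
restored = decompress_chunks(chunks)
print(len(data), "bytes in,", sum(len(c) for c in chunks), "bytes out,",
      len(chunks), "chunks")
```

The trade-off is that each chunk starts with an empty dictionary, so the ratio can be slightly worse than compressing the whole input as one stream.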

Update: The igzip library from Intel is fast. I wonder how it performs on AMD?


Here are some benchmarks. Libdeflate is rather impressive but igzip was not bested when it comes to brute speed. I added igzip to the OP.


Does SAFE currently use file compression? I didn't think it did. If it does, then I'm guessing compression would have to come first, then chunking, then self-encryption.

IIRC, compression is built into the self-encryption concept/process. It may not be implemented yet, though. It makes no sense not to compress the chunks: bandwidth is almost always the limiting factor, and compute is cheap by comparison. Read the docs :nerd_face:
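A quick way to see why compression has to happen before encryption: good ciphertext looks like random bytes, and random bytes do not compress. The sketch below uses `os.urandom` as a stand-in for encrypted data (an assumption for illustration, no real cipher is involved) and compares it with repetitive plaintext of the same length.

```python
import os
import zlib

# Repetitive plaintext, 880,000 bytes.
plain = b"the quick brown fox jumps over the lazy dog\n" * 20_000

# Stand-in for ciphertext: random bytes of the same length.
# Assumption: output of a good cipher is indistinguishable from random data.
ciphertext_like = os.urandom(len(plain))

compressed_plain = zlib.compress(plain)
compressed_cipher = zlib.compress(ciphertext_like)

print(f"plaintext:       {len(plain)} -> {len(compressed_plain)} bytes")
print(f"ciphertext-like: {len(ciphertext_like)} -> {len(compressed_cipher)} bytes")
```

The plaintext shrinks dramatically, while the random buffer stays essentially the same size, so any bandwidth saving has to be captured before the data is encrypted.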


Yes, self-encryption does this:

→ Chunk file
→ Compress chunk
→ Take hash
→ Use the next chunk hash to encrypt the next chunk
→ Use previous 2 hashes and XOR the next chunk
→ Hash content and make this hash the name of the chunk

So this cyclic process (where the previous two chunks' data are used to encrypt and XOR) continues to the end of the file. At the end, the first two chunks can be finalised and uploaded too.
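The steps above can be sketched roughly as follows. This is a toy illustration, not the actual self_encryption library: `zlib` stands in for the real compressor, the proper cipher step is left out so that only the XOR step is shown, and the function names and SHAKE-256 pad derivation are assumptions made for the example. The modulo indexing is what makes the process cyclic, so the first two chunks depend on the hashes of the last two.

```python
import hashlib
import zlib


def sha(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()


def pad(seed: bytes, length: int) -> bytes:
    """Expand a seed into `length` pseudo-random bytes (assumed construction)."""
    return hashlib.shake_256(seed).digest(length)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def self_encrypt(data: bytes, chunk_size: int):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if len(chunks) < 3:
        raise ValueError("this sketch needs at least 3 chunks")
    compressed = [zlib.compress(c) for c in chunks]   # compress chunk
    hashes = [sha(c) for c in compressed]             # take hash
    n = len(chunks)
    names, stored = [], {}
    for i, comp in enumerate(compressed):
        # XOR with a pad derived from the previous two chunk hashes;
        # indices wrap around, so chunks 0 and 1 use the last chunks' hashes.
        seed = hashes[(i - 1) % n] + hashes[(i - 2) % n]
        obfuscated = xor(comp, pad(seed, len(comp)))
        name = sha(obfuscated)                        # content hash is the name
        names.append(name)
        stored[name] = obfuscated
    return hashes, names, stored


def self_decrypt(hashes, names, stored) -> bytes:
    n = len(names)
    out = []
    for i, name in enumerate(names):
        blob = stored[name]
        seed = hashes[(i - 1) % n] + hashes[(i - 2) % n]
        out.append(zlib.decompress(xor(blob, pad(seed, len(blob)))))
    return b"".join(out)


data = b"example file contents " * 500   # 11,000 bytes of sample input
hashes, names, stored = self_encrypt(data, chunk_size=1024)
recovered = self_decrypt(hashes, names, stored)
print(len(names), "chunks, round trip ok:", recovered == data)
```

Here `hashes` and `names` together play the role of the map the client keeps in order to fetch and decode the chunks later; the stored chunks alone are not enough to reverse the XOR.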

From Perplexity:
MaidSafe’s self-encryption works by employing a system that automatically splits, renames, encrypts, and compresses data using algorithms. This process is based on the data itself, requiring no user intervention or passwords. The encrypted data is then dynamically stored at locations selected by the network, aiming to provide a high level of security without the need for user involvement [1][3]. The self-encryption feature is part of the MaidSafe network, which is characterized by its innovative approach to privacy, security, and freedom for its users [3]. Additionally, the self-encryption system is implemented through a library that provides secure encryption of data, with the encrypted chunks considered as safe as those encrypted by any other modern method [4].


Do the current testnets check if a chunk has been encrypted?


As long as it is the client who performs the self-encryption, it is not possible for the nodes to know whether the chunk is encrypted or not.
What the nodes do, in the latest testnet, is encrypt the chunk before storing it on disk.

As for compression, self-encryption uses Brotli.


5 posts were split to a new topic: Proof of Encryption

Looks like Brotli edges out DEFLATE in this study, but this may not be an accurate comparison for the optimized library by E. Biggers mentioned in the OP…
