Convergent encryption ...?

Is the SAFE network vulnerable to the “learn the remaining information” attack?

This attack is described here:

Some folks over here were worried about it:

I don’t think so.

As I was told when I brought up something similar, chunks get encrypted (again) before they are sent to a vault for storage, so chunks (while at rest) don’t resolve to their actual hash, and not even the vaults themselves know what’s on them. There are “storage manager” nodes (or whatever they’re called) that arrange where chunks are stored. Basically, information is dispersed in such a way that it’s extremely hard to get to it from the opposite direction to the one intended.

Sorry to be a little vague, and please somebody correct me if I’m wrong.

It can be, if you can tell who or what is asking for the information, and that is where the problem lies for the attacker. In terms of the actual attack, if a user were to ask a server for data that is encrypted using a convergent encryption algorithm (as self encryption is), then you have a problem. If the attacker can see the user and the requests for the data, then the attack works.

In our case we obfuscate requesters, holders and transmitters of the data.

In cases where even this may not be enough, you can “salt” the data that you want to store, but then who you give the salt to becomes an issue.

Sharing/sending data between two parties is only known to those two.

So the short answer is no, not quite, but there are ways, with an awful lot of resources, that some data “may” be identified.

IMHO these can all be mitigated at varying costs, and without much effort. In its basic form the network protects very well against this, but can do so even more strongly if required.

Hope that helps. I am assuming the known plaintext attack, as I have not read the article yet. Forgive any brevity, as I am answering a lot on and off line right now :smiley:

No. Immutable data, the kind that uses deduplication, is encrypted using only its own contents, without any user key. But before encryption this data is split into chunks, and the encryption of a chunk is derived from the data of a neighbouring chunk, not its own contents. In addition, each chunk, after encryption, is obfuscated by an XOR function with data derived from a neighbouring chunk.
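
To make that description concrete, here is a minimal toy sketch in Python. It is not the actual self-encryption code: a hash-derived XOR keystream stands in for the real cipher, and the function names (`keystream`, `self_encrypt`) and the choice of which neighbour feeds which step are assumptions made purely for illustration.

```python
import hashlib
from typing import List


def keystream(seed: bytes, length: int) -> bytes:
    """Expand a seed into `length` bytes (toy stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(seed + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor(data: bytes, pad: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, pad))


def self_encrypt(chunks: List[bytes]) -> List[bytes]:
    """Encrypt each chunk using material derived only from its neighbours."""
    hashes = [hashlib.sha256(c).digest() for c in chunks]
    n = len(chunks)  # assumes at least 3 chunks so the neighbours are distinct
    encrypted = []
    for i, chunk in enumerate(chunks):
        key_seed = hashes[(i - 1) % n]  # assumed: key comes from the previous chunk
        pad_seed = hashes[(i - 2) % n]  # assumed: XOR pad comes from the one before that
        step1 = xor(chunk, keystream(b"enc" + key_seed, len(chunk)))
        step2 = xor(step1, keystream(b"pad" + pad_seed, len(chunk)))
        encrypted.append(step2)
    return encrypted


chunks = [b"aaaa", b"bbbb", b"cccc"]
first = self_encrypt(chunks)
second = self_encrypt(list(chunks))
changed = self_encrypt([b"aaaa", b"bbbb", b"cccd"])
```

In this sketch the same file always produces the same encrypted chunks (which is what allows deduplication), while changing one chunk also changes the encrypted form of its neighbours.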

Well look at that. I didn’t know this attack was already formalized. :hushed:

I discussed this attack (or something very similar) with David about two weeks ago. As David said, the endpoints are the weakness.

See:

For Uploads:

POTENTIAL SOLUTION

The idea is to have the Client ask the Close Group for the Data Managers’ public key. The Client then self encrypts the chunks and wraps them first with the Data Managers’ public key, second with the Close Group’s public key, and finally with the Client’s public key. During transport to the Vaults, each layer is peeled off like an onion by the respective Groups/Managers. Problem solved. I hope.

For Downloads:

POTENTIAL SOLUTION

The Data Managers salt the chunks, encrypt the chunks with the Client’s (the requester’s) public key, then finally encrypt each chunk with the Close Group’s public key. The first layer is peeled off by the Close Group, the second is decrypted by the Client before finally discarding the salt and joining the chunks together to reconstruct the complete file. Solved?

@dirvine We’ve discussed this before but I’m still unclear if this is confirmed to be both possible and planned for version 1.0 of the network. Maybe as an optional feature?

There are a few attacks being discussed here. In terms of the OP, no, we are in the clear here. The network will respond with data if you ask for it, but you cannot know the holders. Imagine the DataManager is the line up to which the client can see. Across that line are the node managers and nodes. The client does not see them.

In terms of scanning your own store for data, we did have a scheme in the C++ code (not yet reimplemented) where the data is encrypted (obfuscated) again before it is stored, so the node does not know what it holds. This is for other types of attacks.

All of these need you to have a node in the network that data passes through, which you may have. These nodes it passes through do not know who gave it or who asked for it. There will be an XOR delivery address and this can be a throw away address. The method I described hid that content from the only node that knows the IP address of that client with the throw away address.

I am not sure of an attack at the moment that de-anonymises the client or allows the attack in the OP. It is very different on a p2p network as opposed to a server based network. The coverage area is pretty large. So maybe I am missing something in the question? I feel I am.

@dirvine, an attacker can use the content of the file to associate it with the holder.

For example, assume that a company creates an XML document with an invoice and stores it on the SAFE network. This XML document could have a fairly rigid structure and known values (for issuer/recipient, etc.). However, just the fact that such a document exists on the SAFE network associates it with specific entities from the real world. There’s no need to associate the document with the entity at the protocol level.

Someone on the XRP forum claims a mod on Reddit deleted this question from the MAIDSAFE sub. Any Reddit mods here have a comment?

@happybeing comment: I think @dirvine and I are the only active reddit mods and neither of us would delete something like that. David found it in the mod queue because reddit thought it looked like spam. It is now approved. Thanks for alerting us to it. :slight_smile:

Here are some graphics to help explain my understanding of the attack and potential solutions:

POTENTIAL UPLOAD SOLUTION:

POTENTIAL DOWNLOAD SOLUTION:

It seems that file requests alone are also vulnerable to this attack. One way to solve it would be to have the Client encrypt the file request with the Close Group’s public key. I’m not entirely sure, but it might be possible for the attacker to obtain the Close Group’s public key, enabling the attacker to carry out the very same fingerprint/probe attack on the file request. If the Client were to couple the request with some random data, this attack could be mitigated. Any good?

Maybe I’m not understanding this correctly. I’ll take a shot anyway.

If the company were to store this XML file you speak of on the SAFE network, it would make sense for them to store it privately. Nobody other than the one who uploaded it would know that it exists on the network. The uploader’s credentials would be necessary to retrieve the file and decrypt it. If the company stored a sensitive file as public data on the network, the last thing they should be worried about is correlation. Mistakes like this could be averted by a confirmation prompt for all uploads destined to enter the public domain.

Brute forcing other people’s private data stored on your vault is the only way you’ll get a tiny glimpse of what is stored on the network as a whole. Having data broken into many pieces makes this especially difficult, as you would first need to gather all of the related chunks.

Files smaller than 1MB are bundled with the user’s datamap of all of his/her files. Those, I guess, would be the easiest to brute force, if that is possible at all with current and near future processing power. Once that’s achieved, it should be fairly straightforward to gather the chunks specified in the datamap and use the brute forced key to decrypt them. All of this is very difficult and likely impractical. Hope I helped to clarify. :relaxed:

Welcome to the forum! :slight_smile:

Hasn’t David already mentioned this fix?

The second to last hop is the relay, a.k.a. the attacker. It doesn’t matter if he follows the rules and encrypts the chunk with his own public key. He will still be able to see the chunk before he encrypts it to the Client.

The issue here is that the relay will see the chunk encrypted only with the client’s public key. So it’s a self encrypted chunk wrapped in the client’s public key encryption. All the attacker would have to do is fingerprint all of the chunks passing through his machine encrypted with the client’s public key, then compare the fingerprints of the chunks he observes with the ones he has in his database of “illegal” files. This would of course only be possible if the attacker first encrypts his own gathered chunks (the ones he found publicly available on the SAFE network) with the client’s public key.

Please reread and reread and reread the information provided in the first graphic. I don’t know how else to further simplify this. I might be missing something but I’ve yet to be given clarifying information. Until then, I can only continue seeing this as a real attack vector with serious implications.

If what David meant is that the chunk is encrypted with the Close Group’s public key before passing it to the relay, then I’d be happier. Although, I wonder if the attacker can get the Close Group’s public key. It’s unlikely that the attacker would be using the same Close Group as the Client he is relaying traffic for. So I suppose the attacker wouldn’t have the privilege of knowing the public key of the client’s Close Group. I don’t know. Please clarify.

Your diagram says the relay node

SAFE ---> relay ---> user

David says

SAFE ---> 2nd to last hop  --> last hop (relay) --> user
          encrypts packets --> encrypted        --> user decrypts packets then decrypts chunk

So the relay does not know the key to decrypt the chunk passing through him to the client, thus it doesn’t matter if he knows the keys to the chunk being watched for. All chunks he relays are encrypted with a key he does not know.

The client then decrypts the packets as they come, using a key the relay does not know, and then uses the self encryption keys to decrypt the chunk.

Isn’t this the attack: the relay node knows the self encryption keys of the chunks being watched for, and when the attackers are the relay node and see the chunks they are watching for, they catch the user? David suggests, as I thought you did too, encrypting the packets so that the relay node cannot decrypt them and watching is defeated.
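
As a rough illustration of why that extra hop layer defeats fingerprinting, here is a toy Python sketch. Everything in it is an assumption for illustration: a random shared key (`hop_key`) stands in for whatever key exchange the network really uses between the second-to-last hop and the client, and a hash-derived XOR keystream stands in for a real cipher.

```python
import hashlib
import os


def keystream(key: bytes, length: int) -> bytes:
    """Expand a key into `length` bytes (toy stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def wrap(data: bytes, key: bytes) -> bytes:
    """Toy symmetric layer: XOR with a keystream, so applying it twice unwraps."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def fingerprint(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# The relay's watchlist: fingerprints of self-encrypted chunks it already knows.
watched_chunk = b"self-encrypted chunk of a known public file"
watchlist = {fingerprint(watched_chunk)}

# Hypothetical key shared by the second-to-last hop and the client only.
hop_key = os.urandom(32)

on_the_wire = wrap(watched_chunk, hop_key)      # what the relay forwards
relay_sees_match = fingerprint(on_the_wire) in watchlist
client_chunk = wrap(on_the_wire, hop_key)       # client removes the hop layer
```

Here the relay forwards bytes whose fingerprint is not on its watchlist, while the client, who holds the hop key, recovers the original chunk.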

Or am I missing something???

I hope you’re not. I assume this means that the last hop (relay) has no way of determining the public key of the 2nd to last hop. If this is so, then I’m clear.

Now what about uploads and file requests? Are they too encrypted with the public key of the 2nd to last node? Can you answer with certainty?

In what cases might it not be enough?

What is to be mitigated? What else can be done for this higher level of protection to be achieved? I’m very curious. :open_mouth:

Say you calculate there is a 1 in 10,000,000 chance an attacker is in a route to a chunk you want (based on network density) and may know the content. In such cases a salt will provide the additional change to the file to make it unknown. So this just changes possible known data to all unknown.

Really what I described. Much of security is not about the impossible, but the highly improbable or infeasible (like factoring the product of two large primes or guessing a private key).

If we are talking about snooping for known data, then the best way to prevent it completely is not to have known data, and using an agreed salt-type mechanism between folks is best. Or encrypt in an app with an agreed group key etc.
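
A small Python sketch of the salt idea, under loudly stated assumptions: `convergent_address` is a hypothetical toy (a repeating hash-derived XOR key, which is not secure) that only serves to show that the stored name depends on nothing but salt plus plaintext.

```python
import hashlib
import os


def convergent_address(plaintext: bytes, salt: bytes = b"") -> str:
    """Toy convergent scheme: key and stored name depend only on salt + plaintext."""
    key = hashlib.sha256(salt + plaintext).digest()
    ciphertext = bytes(b ^ key[i % len(key)] for i, b in enumerate(plaintext))
    return hashlib.sha256(ciphertext).hexdigest()


document = b"a widely known public file"
group_salt = os.urandom(16)  # agreed privately between the people sharing the file

public_address = convergent_address(document)            # anyone can recompute this
alice_address = convergent_address(document, group_salt)
bob_address = convergent_address(document, group_salt)
```

Everyone holding the salt arrives at the same address, so sharing within the group still works, but a snooper who knows only the plaintext computes a different address and has nothing to match against. The cost is that salted copies no longer deduplicate against unsalted ones.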

I think though digging in here as we move forward is going to be good as we can point to the code and detail exactly what happens. At the moment there are pretty fast moving changes in over 20 libs so we need to get to a beta release and really poke at these parts. But it will always be easier to counter such snooping with non publicly knowable data.

TL;DR Say we imagine such a snoop would mean targeting a single node (keep in mind these nodes change keys every session), and may mean several thousand computers being used to try to get to a hop close to that node in the route of a known piece of data, hoping it requests it. So we think, OK, this is an effort costing several million or hundreds of millions of dollars. Then we need to prove it was requested by that node (a different route) etc. etc. The end result here is that, with the resources, a snooper could say this XOR address was sent a piece of data we know is bad (using anonymous encryption, to defend the person sending its address), and so on.

If we want to be even further away from being detected then we can salt all our data we share amongst folk, or use private sharing instead.

I don’t understand why you’re all focusing on finding the owner of the data at the protocol level. This is not the point of the “learn the remaining information” attack.

The point of the “learn the remaining information” attack is that the set of possible plaintexts can actually be very small. The owner of the data can be known in advance. If one can derive the content of the encrypted chunks from the plaintext alone AND check whether the ciphertext exists on the SAFE network, then this attack is viable.
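
Here is a minimal Python sketch of that attack against a generic convergent scheme, not against the SAFE network itself. The toy cipher, the `exists` lookup, the invoice template and the in-memory `network` set are all hypothetical stand-ins.

```python
import hashlib


def chunk_address(plaintext: bytes) -> str:
    """Toy convergent scheme: the stored name depends only on the plaintext."""
    key = hashlib.sha256(plaintext).digest()
    ciphertext = bytes(b ^ key[i % len(key)] for i, b in enumerate(plaintext))
    return hashlib.sha256(ciphertext).hexdigest()


def learn_remaining_information(template, candidates, exists):
    """Return every candidate whose filled-in document is already stored."""
    return [
        c for c in candidates
        if exists(chunk_address(template.format(c).encode()))
    ]


# Hypothetical store holding one invoice whose amount the attacker does not know.
network = {
    chunk_address(b"<invoice><issuer>ACME</issuer><amount>4200</amount></invoice>")
}

# The attacker knows the rigid structure and guesses the one missing field.
template = "<invoice><issuer>ACME</issuer><amount>{}</amount></invoice>"
found = learn_remaining_information(template, range(10000), network.__contains__)
```

The attacker never decrypts anything: ten thousand guesses and an existence check are enough to recover the missing field, which is why the attack depends on being able to ask whether a given ciphertext exists.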

Does this confirm “check if ciphertext exists on the SAFE network” part?

Yes, at least once.
One of the nice things about new forum members here is they never seem to hesitate to rehash an old topic.

@neo Excuse my ignorance here but you reference “fix” which demonstrates something required fixing. Is a fix required? or is the fix in? Or was a fix never required?

Seems the Storj boyz have their own feelings about this.