How does SAFE protect whistleblowers?

dirvine · July 26, 2014, 10:48pm

You will see, then, that this is not the Routing API; this is a normal Kademlia API. Our Kademlia-like implementation is Vaults (for store, get, etc.) and Routing for lookups. If you read the papers and/or the code you will see this is not the API. The systemdocs state very clearly why Kademlia, or any implementation of it, is not good enough and required a complete rethink. Please go check it out.

The user DOES NOT send a STORE directly to a storing node or data manager; at the moment it goes via a close node. Please read how the storage part works here: Documentation · maidsafe-archive/MaidSafe-Vault Wiki · GitHub

In terms of subverting FIND_NODE calls etc., see the send/ack implementation: we send 4 messages in parallel and route around any node that cannot prove it sent the message one step further, using a system of 5 messages to prove this. You can see how routing works here: Documentation · maidsafe-archive/MaidSafe-Routing Wiki · GitHub

EDIT [TL;DR: what you are describing is how you imagine routing and vaults are implemented, using normal Kademlia to explain it. This is not normal Kademlia, and attempting to imagine what we have done will produce weird results. If you read what we have done, it will be much clearer.]

anon86652309 · July 26, 2014, 10:54pm

Good. You’ve made me re-read the doc in question, and it’s unfortunately out-of-date too. We really struggle with documentation and communicating the current plans outside the Troon hub. As you’ve probably noticed, @dirvine does a mountain of work on this front, but we need to have more than just him. There was talk recently of taking on some technical authors to address this problem, but typically, because it’s not directly related to writing code, I haven’t kept up to speed with how that’s progressing.

Anyway, as for this specific case, your time wasn’t completely wasted. Much of the doc is still valid, but I’ll focus on the section about the lookup. To quote the paper:

E. Recursive lookups
It is very important in distributed computing not to hold state on remote actions. This is because remote actions are just that, remote and therefore out of your control. Kademlia handles this well with iterative searches carried out in a loosely parallel fashion as described in II.

With managed connections, however, there is a different situation as we are working with a very current network of nodes who are all in communication. In such a case a recursive lookup may prove significantly faster and also with much less network traffic.

This recursive lookup can now be a single message to the closest node in the routing table, who recursively passes on to their closest node and so on. On any failure the recursion would continue from the previous node to the failure, who has an open RPC that will fail to the failed node and can easily select the next closest node.

On finding a node or value the requester is passed the contact tuple of the node in question from the last node in the chain (not the actual node who has the answer) and then continues with normal kademlia logic, which may involve getting the κ closest nodes in a find node situation or simply getting a value in the get value situation. Caching and last node requests in addition to caching (in future) can then also cache the value in a find value request and do so without being requested.

Much of this is correct, but there are a couple of critical differences between this and what we now have implemented.

At this level (the Routing layer), there is no longer any notion of values. All values are stored and retrieved via the Vault overlay’s protocols, which is probably where much of the confusion in this thread has stemmed. Routing simply provides a method of delivering messages from one node to either a single peer, or a close group of peers.
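As a rough illustration only, the recursive forwarding from the quoted section, reduced to pure message delivery with no values involved, might look like the sketch below. The toy `network` table, the `route` function, the 4-bit IDs, and the use of XOR distance as the closeness metric are all assumptions made for the example; this is not the MaidSafe-Routing API.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy network: node ID -> IDs held in that node's routing table.
network: Dict[int, List[int]] = {
    0b0001: [0b0100, 0b1000, 0b1100],
    0b0100: [],
    0b1000: [0b1010, 0b1100],
    0b1010: [0b1110, 0b1011],
    0b1011: [0b1111],
    0b1100: [0b1110, 0b1111],
    0b1110: [0b1111],
    0b1111: [],
}


def route(table: Dict[int, List[int]], start: int, target: int,
          failed: Optional[Set[int]] = None) -> Optional[List[int]]:
    """Recursively forward a message from `start` towards `target`.

    Each hop passes the message to the peer in its routing table that is
    closest to the target (XOR distance, an assumption). If that peer has
    failed, the hop falls back to its next closest peer. Returns the list
    of node IDs the message travelled through, or None if undeliverable.
    """
    dead = failed or set()

    def forward(current: int) -> Optional[List[int]]:
        if current == target:
            return [current]
        # Only peers strictly closer to the target are considered, so the
        # distance shrinks on every hop and the recursion always terminates.
        closer = sorted(
            (p for p in table[current] if (p ^ target) < (current ^ target)),
            key=lambda p: p ^ target,
        )
        for peer in closer:
            if peer in dead:
                continue  # sending to this peer fails; try the next closest
            path = forward(peer)
            if path is not None:
                return [current] + path
        return None

    return forward(start)
```

Calling `route(network, 0b0001, 0b1111)` follows the closest peer at each hop, and passing `failed={0b1100}` shows the previous hop picking its next closest peer instead.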

The other significant divergence is that the last quoted paragraph is mostly wrong. Regardless of whether a message is destined for a single node or a close group, no contact tuples are passed around. As far as I’m aware, we use NodeIDs (more or less the hash of the Client’s or Vault’s public key) exclusively in Routing, with the single exception of the case where 2 peers wish to connect to each other.
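To make the NodeID idea concrete, here is a minimal sketch of deriving an ID from a public key and picking a close group for a target ID. The choice of SHA-512, the XOR distance metric, the group size of 4, and the names `node_id` and `close_group` are all assumptions for illustration; the actual implementation may differ on every one of them.

```python
import hashlib
from typing import Iterable, List

GROUP_SIZE = 4  # hypothetical close-group size, chosen only for illustration


def node_id(public_key: bytes) -> int:
    """Derive an ID by hashing a public key (SHA-512 is an assumption)."""
    return int.from_bytes(hashlib.sha512(public_key).digest(), "big")


def close_group(target: int, known_ids: Iterable[int],
                size: int = GROUP_SIZE) -> List[int]:
    """Return the `size` known IDs closest to `target` by XOR distance."""
    return sorted(known_ids, key=lambda nid: nid ^ target)[:size]
```

A message addressed to a single node would go to the one ID equal to the target, while a group message would go to every ID returned by `close_group`; in both cases only IDs are exchanged, never contact tuples.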

anon86652309 · July 26, 2014, 10:58pm

We need to update the docs! Should we pull down the PDFs until we can get time to recheck them, or until the technical authors start (if that’s still going ahead?)

dirvine · July 26, 2014, 11:50pm

Agreed, we can take some doc update time soon enough. We need to point everyone to systemdocs for now.

willish · July 27, 2014, 7:41am

Yeah, this was just my interpretation of your maidsafe-dht, i.e. Kademlia with the extra GIVE_VALUE operation. This was gleaned from your DHT paper PDF. I guess that paper is no longer valid.

I also watched the video of David’s presentation at the Seattle Conference.

What I understood from this was that, yes, chunks were not stored on the DHT, but pointers were. So you would look up a chunk X using the DHT and you would eventually get a value which was actually a pointer to a vault in the Vault layer.

I guess that is also no longer valid?

Aaaaaah so the DHT is never used to store any information of any kind? This could explain A LOT of my confusion!

So your DHT is simply used for the FIND_NODES style operation, in order to send direct and group messages?

So it's not really a DHT after all.

willish · July 27, 2014, 7:48am

If you are looking for technical writers coming at this from a fresh perspective, one suggestion might be to post a bounty to write about X, Y, etc.

In addition one could try and visualize the important parts of the system. Animations/Videos could help enormously.

I’d also be pretty keen to help out.

dirvine · July 27, 2014, 9:55am

Yes, we badly need some help there, as you have found out with this paper for sure (sorry about that, I never realised that was there). The most recent docs are the systemdocs on the website, then the dev wiki, then the papers, then the patents. So there is no lack of content; just the opposite, there is too much, plus the code of course.

We have been asking for a wee while now for authors to help out. We don’t have the time in house, so this would be great. Bounties may also be good, but there have not really been many authors with experience coming forward. I know Nick was chasing down a company here, but if you want to give it a go we would certainly take you up on that. DHT would not be a good start just now though, as it is way too big.

We have a Pivotal Tracker task to update the self encryption document, which we took offline until it was redone (same reasons as the DHT paper).

qi_ma · July 27, 2014, 10:41am

Yes, we need some updates in the docs. Though they give a general feel for our system, some detailed sections have changed during the implementation.
