On scalable, decentralised applications (MaidSafe article)

Hi all!

I want to share something with you guys. It is still very rough, very early work, but I am now confident that there is an early foundation that makes sense. It is probably also for a niche audience (those interested in theoretical frameworks).

But this is the official SAFE network forum, where all the news is reported first; and we have a good tradition of developing everything out in the open.

Final warning, this has been a side-project for the past few weeks (I’ve been wanting to do it for a long time), and progress on such matters goes really very slow; also my policy for accepting contributions will be very different from accepting code contributions, so anything will start with a discussion first.

I started writing a formal article that attempts to take the minimal distance needed from the code implementation of the SAFE network, and start to construct a formal language to think, formulate and prove theorems that will go to support the implementation of the SAFE network.

This is intended as a ‘symbiotic effort’, I strongly believe to first imagine and build - as MaidSafe has been doing very successfully; in a second step we try to formalise and improve. There is no guarantee we will be able to complete this formalisation either; the SAFE network is an emergent network and as such, you can almost argue it is proven to be unprovable; but we can try :smile:

So here it goes, it might be a heavy read - and again, far from finished, not ready for publication, probably to be completely rewritten, etc:

hope you like it


Woow, makes my brain spin :wink: I’ll read it again. What I missed in the terminology is the explanation of “closeness”. There’s talk about nodes being close, but it’s not made clear that this is XOR-based.

edit: or maybe that’s on purpose, that it could work for different forms of closeness.

1 Like

hi @polpolrene,

great question !

yes within the SAFE network there are at least several clear reasons why we want to prefer XOR distance over any other distance measure (that I know of), but so far I’ve not needed it in these arguments, so I deliberately kept the distance measure unspecified, just that there is a distance measure on the address space.

This shows that this article is a parallel effort to the actual code. They don’t have to be the same, but the idea is that each can learn from the other.


Hi Ben,

Glad to see you surface!

This looks like it is probably very significant. Only problem is that I don’t know what it means. Could you give a laymen’s language thumbnail of what this is directed at accomplishing. I just don’t quite connect.

Not asking to sort out the math, which will probably always be beyond me, but what it’s solving.


Hi John,

The aim is to extract the protocol from MaidSafe; the deeper lessons learnt in a new language that allows to reason with it. If successful it should cover all aspects of the decentralised consensus mechanism. With that it should be the foundational article for a new class of scalable, decentralised, autonomous applications on the internet.


Wow! Thanks. …


Ben, really interesting initiative to formalise the workings of the SAFE network! I can imagine if you can put the whole theory down in a formally correct way you can easily spot any loopholes or weaknesses in implementation. Great to better understand how it works on a fundamental level as well :smiley:.

I have been reading a bit through the document, some questions/things that aren’t entirely clear:

  • Proof to corollary 4.1 is not trivial to me. How is m always an element of n|m? There can be g+1 other nodes in the table of m that are all closer to n than m is? Proof is found in the fact that closest nodes are selected from union of the table of m (which can be empty) and m self (which is one node)? What is the use case of this however? We have a close group which contains node m, but we already arrived at node m?
  • Definition 5: For effective routing, why is it necessary that n(i+1) be an element of the close group of n(i)? Why wouldn’t n(i+1) to be closest element to b of the entire table of n(i) not be faster and equally as effective?
  • Definition 8: “and the set is closely connected”: is this a requirement or an implication?
  • Definition 8, example 2: Can it be proven that if A and B are both effective close groups of a target n that there are cases that n doesn’t closely connect them (intuitively I’d say that n could be the bridge between A and B, but there might be situations where this doesn’t hold)?

Learning a lot from this! Looking forward to the document to evolve further.


The proof is wrong ! :slight_smile: great input already; so yes you’ve already suggested the correct proof T_m U m is never empty so there always is a closest node to n; hence n|m is not empty

!! much appreciated, great feedback already. The use of 4.1 is limited to knowing such a set is never an empty set. It might be clear that the discussion is interested in sets m|m for a node m, ie "our close group (as seen by ourselves)

The reason I do first introduce definition 4 is to make it clear that “a close group” is always seen by a node, and

  1. does not exist for a data element, eg “m|n” “the close group of node m as seen by data n” is not defined.
  2. the close group “n|m” (n data, m node) does not need to be correct/sensible if m is very far away from n; for a far away node m, the close group of n as seen by m can be highly distorted.

So that’s why I explicitly introduced the concept “as seen by node” for the words “close group”


So I’ll admit that little effort has gone in to make it pedagogical; I already need all my mental focus on being consistent, on target and hopefully correct :smile:
My initial topic of interest here is not on the full aspects of routing; two pages down I caved in and put an unhelpful comment:

We will limit our considerations for long distance message delivery
through the connected network as this research field is extensively
studied and usable results already exist. We refer to the references
and appendix for a brief discussion, but assume the reader is aware of
common techniques.

So thus far the aim is on establishing the conditions needed on the “tables of nodes” to ensure the logic for local consensus group algorithms holds. The way I saw it was that as we progress towards the end, all the conditions on the tables needed, for efficient local group consensus, as well as full network scale message delivery are combined and such an object would be a “routing table” close to the object as we know it in the code.

So to answer your question: a path (definition 2) puts no such restriction, but a “close path” and a closely connected set" have this additional condition to ensure that some of the following lemmas trivially follow.

I only went back and defined these concepts of close path and closely connected set after I convinced myself that lemma 1 and lemma 2 are not provable without this condition, and hence defintion 8 on the effective close group is non-sensical (without theorem 1)


yes, with the limited powers of my brain, this is a requirement :), but that is not a bad thing. It tells us what the conditions are we need to assert in any implementation. Note that for the effective close group to not be closely connected would be a contradition in terms as well.

note: if you would traverse back through the git history you will see I first had no such condition, then defined a “connected set”, made that the condition, made some progress, had to revert and also introduce a closely connected set, tried to make progress, had to introduce it as a condition on the effective close group. :slight_smile:


Note that for an arbitrary target name n, n can be a node or n can be a piece of data; in the latter case n doesn’t form connections and disconnected sets A and B can both consider n close without introducing any bridge (as there is no node n to imply a bridge); but good point actually: should n be a node then a bridge could be possible

although it can still be possible for the node n to be connected, but not closely connected to A and/or B, and as such not be part of its own effective groups, nor form a bridge…


@Dirk83 pushed the correction to corollary 4.1 and added you to the acknowledgements


Thanks, that explains :smile:. A special interest in m|m is clear in the discussion. Understand the use case of such a set to include node m now (let’s say we have only one node in the network (if that doesn’t give other complications) named m and the target n, then if we look we for the closest node we find ourselves :smiley:).


Didn’t realize target can be a piece of data as well; thanks, A and B to be effective close groups for target n, but n not closely connected to both of them was the counterexample I was looking for :).

Haha, too much honour for a little correction but fun to be mentioned :smiley: .