Nothing major to report this week on the tech front, so we’ll keep it to a general progress report rather than the usual deep dive. More to report on the team front, however, where we’re happy to welcome a new team member. Mostafa is an experienced blockchain dev who hails from Malaysia.
Welcome to the team, Mostafa!
Hi. This is Mostafa. I am a software engineer. Nice to meet you.
Once when I was a student, I decided to visit an ancient fire temple (a building more than 2000 years old) which was located in a wild place. Finding the fire temple was tough, so I asked some local people to help me. Interestingly, they called it the “Devil House” and when I asked them why, they said, “because humans can’t build such a thing, it must be the devil”. This came to mind recently when I read about the devil’s bridges in Europe: bridges that were built centuries ago and still function. People call them devil’s bridges because they can’t believe humans built them; it must have been the devil.
No one knows who built the devil’s bridge or the fire temple, but they are still standing. Let’s build a devil network!
General progress
The SectionTree work is now done.
A reminder from @roland on why we re-implemented some parts of the SectionTree
…
The SectionTree (previously NetworkPrefixMap) is a data structure that encapsulates our current knowledge about the network. It can be thought of as a tree where each node is a SectionKey signed by its parent SectionKey, except for the root node (genesis_key). The leaves of the tree also hold the section’s SAP while discarding the SAPs of the non-leaves.
Since every Client/Node can have a different network view, the SectionTree can vary vastly between the participants. Moreover, it is necessary to send parts of the tree out (SectionTreeUpdate, which bundles the proof chain + SAP) to anyone that requests it and be confident that they do not mess up their tree while trying to update it.
The SecuredLinkedList was previously used to construct the SectionTree, but it had a couple of issues that led to incorrect insertions, and it was not CRDT compliant. Hence it was re-implemented as a CRDT (specifically MerkleRegister), and the old SecuredLinkedList was replaced with the new SectionsDAG. Now we can be sure that it behaves as expected no matter the order or length of the update!
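To make the "no matter the order" property a little more concrete, here is a minimal sketch of the idea, not the real SectionsDAG or MerkleRegister code. Everything in it is an assumption made for illustration: the names ToySectionsDag and toy_sign are hypothetical, keys are plain integers rather than BLS keys, and the "signature" is a toy arithmetic check so the example needs no external crates. The state is simply a grow-only set of signed parent-to-child links, which is why applying updates in any order ends in the same place.

```rust
use std::collections::BTreeSet;

// Toy stand-in for a section key. Real section keys are BLS public keys.
type Key = u64;

// Toy "signature" of a child key by its parent. This is NOT cryptography,
// just a deterministic check so the sketch needs no external crates.
fn toy_sign(parent: Key, child: Key) -> u64 {
    parent.wrapping_mul(31).wrapping_add(child)
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct ToySectionsDag {
    genesis: Key,
    // Every (parent, child) link whose signature checked out.
    // A grow-only set, so applying updates in any order gives the same set.
    edges: BTreeSet<(Key, Key)>,
}

impl ToySectionsDag {
    fn new(genesis: Key) -> Self {
        ToySectionsDag { genesis, edges: BTreeSet::new() }
    }

    // Accept a link only if the parent's toy signature over the child is valid.
    fn insert(&mut self, parent: Key, child: Key, sig: u64) -> bool {
        if sig != toy_sign(parent, child) {
            return false;
        }
        self.edges.insert((parent, child));
        true
    }

    // Keys that can be traced back to genesis through stored links.
    fn trusted_keys(&self) -> BTreeSet<Key> {
        let mut seen = BTreeSet::from([self.genesis]);
        let mut grew = true;
        while grew {
            grew = false;
            for &(parent, child) in &self.edges {
                if seen.contains(&parent) && seen.insert(child) {
                    grew = true;
                }
            }
        }
        seen
    }

    // Trusted keys with no children: the keys that would carry a SAP.
    fn leaves(&self) -> BTreeSet<Key> {
        self.trusted_keys()
            .into_iter()
            .filter(|k| !self.edges.iter().any(|&(parent, _)| parent == *k))
            .collect()
    }
}
```

In this toy model a link whose parent has not arrived yet is simply stored and only becomes trusted once the chain back to genesis is complete, so a late or reordered update cannot corrupt the tree.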
@roland has now moved onto distributed key generation (DKG) testing with @anselme. As explained over the last couple of weeks, there have been issues with the DKG process not always terminating. After some intensive testing, it’s looking pretty stable now, and @anselme is writing some documentation for the test process.
A note from Anselme on the upcoming DKG…
The upcoming DKG, more resilient, and without timers
The DKG process is currently an active process using timers that in cases of heavy network activity will often fail because of timeouts. We’ve recently been working on a new DKG that doesn’t make use of timers, instead allowing messages to be delayed as well as supporting heavy packet drops. When nodes don’t receive messages for a while, they gossip their knowledge to the others in the hope of getting them up to speed, or even getting updates from them if the sent information is out of date. This way, eventually every node will reach DKG termination.
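As a rough illustration of why gossip can replace timers, here is a small simulation. It is a sketch under stated assumptions, not the new DKG itself: ToyDkgNode and gossip_until_done are hypothetical names, a node's "knowledge" is reduced to the set of participant ids it has heard from, and message loss is modelled by a caller-supplied function. A receiver merges what it is sent and, if it notices the sender was behind, replies with its own fuller view.

```rust
use std::collections::BTreeSet;

// Hypothetical per-node state: the ids of the participants whose
// contribution this node has seen so far.
#[derive(Debug, Clone)]
struct ToyDkgNode {
    seen: BTreeSet<usize>,
}

impl ToyDkgNode {
    fn new(id: usize) -> Self {
        ToyDkgNode { seen: BTreeSet::from([id]) }
    }

    // A node is done once it has seen every participant.
    fn is_done(&self, total: usize) -> bool {
        self.seen.len() == total
    }

    // Merge gossiped knowledge. Returns true when the sender's information
    // was out of date, i.e. we already knew something the sender did not.
    fn receive(&mut self, incoming: &BTreeSet<usize>) -> bool {
        let sender_behind = !self.seen.is_subset(incoming);
        self.seen.extend(incoming.iter().copied());
        sender_behind
    }
}

// Run gossip rounds until every node is done. `dropped(round, from, to)`
// decides whether a given message is lost. Returns the number of rounds
// that were needed, or None if `max_rounds` was not enough.
fn gossip_until_done(
    total: usize,
    max_rounds: usize,
    dropped: impl Fn(usize, usize, usize) -> bool,
) -> Option<usize> {
    if total < 2 {
        return Some(0);
    }
    let mut nodes: Vec<ToyDkgNode> = (0..total).map(ToyDkgNode::new).collect();
    for round in 0..max_rounds {
        if nodes.iter().all(|n| n.is_done(total)) {
            return Some(round);
        }
        for from in 0..total {
            // Rotate through peers so everyone is eventually contacted.
            let to = (from + 1 + round % (total - 1)) % total;
            if dropped(round, from, to) {
                continue;
            }
            let payload = nodes[from].seen.clone();
            let sender_behind = nodes[to].receive(&payload);
            // The receiver brings an out-of-date sender up to speed,
            // unless that reply is lost too.
            if sender_behind && !dropped(round, to, from) {
                let reply = nodes[to].seen.clone();
                nodes[from].receive(&reply);
            }
        }
    }
    None
}
```

For example, calling gossip_until_done(5, 100, |round, _, _| round % 3 != 0) loses every message in two rounds out of three and still finishes; it just takes more rounds than the lossless case. No step in the loop depends on a deadline, so delays only slow things down.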
We now also embrace concurrent DKGs with this new implementation. It’s now a race between DKGs, and whichever set of candidates reaches the end before the others gets a chance at becoming the new set of elders. Some DKG sessions might stall or finish late, but we don’t consider them to be failures; they participated in the race and simply lost. We want the next set of elders to be as reliable as possible, so choosing the winner in this DKG race is also a way for us to pick the best nodes. Eventually, after section churn, these DKG sessions become outdated and as the candidates realise they lost the race, they know they can stop.
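The race can be pictured with a few lines of code. This is only a sketch: ToyDkgSession, pick_winner and can_stop are hypothetical names, and the section_generation counter (bumped on every churn) and the logical finished_at step are assumptions made here to model "finishing first" and "becoming outdated", not fields from the actual implementation.

```rust
// Hypothetical record of one DKG session among several running concurrently.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ToyDkgSession {
    // Churn counter of the section when the session was started (assumption).
    section_generation: u64,
    // Ids of the elder candidates taking part in this session.
    candidates: Vec<u32>,
    // Logical step at which the session terminated, None while still running.
    finished_at: Option<u64>,
}

// The winner is the session for the current generation that finished first.
// Stalled or late sessions are not errors, they simply do not win.
fn pick_winner(
    sessions: &[ToyDkgSession],
    current_generation: u64,
) -> Option<&ToyDkgSession> {
    sessions
        .iter()
        .filter(|s| s.section_generation == current_generation)
        .filter_map(|s| s.finished_at.map(|step| (step, s)))
        .min_by_key(|(step, _)| *step)
        .map(|(_, s)| s)
}

// Once the section has churned past the generation a session was started
// under, that session is outdated and its candidates can stop.
fn can_stop(session: &ToyDkgSession, current_generation: u64) -> bool {
    session.section_generation < current_generation
}
```

With three sessions in generation 1, one finishing at step 7, one at step 5 and one still running, pick_winner returns the step-5 session; after churn moves the section to generation 2, can_stop is true for all three.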
Another dynamic duo, @bochaco and @chriso, have been debugging the DBC tests. They’ve found a glitch whereby the SAP (list of current elders) is being removed by mistake during the client testing process, which destabilises the network.
Meanwhile @bochaco continues to work on the other major activity stream at the moment, streamlining the messaging processes and using single threads where possible.
@joshuef and @davidrusu are also working on aspects of message rationalisation. Josh is investigating extracting the comms module out of node code, to see if that might lend itself more neatly to multi-threading, while David is decoupling some of the code shared by the sn_node and network_knowledge crates, which will hopefully eliminate some of the node joining errors we’ve been seeing.
Useful Links
Feel free to reply below with links to translations of this dev update and moderators will add them here:
Russian; German; Spanish; French; Bulgarian
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!