Summary
Here are some of the main things to highlight since the last dev update:
- Happy New Year!
- We are excited to announce that the Dynamic Membership work has been completed and we have today published several Byzantine Reliable Broadcast (BRB) crates on GitHub. All are linked from the central brb crate.
- We've been working hard, even over the holidays, to improve our errors and comms layer, including creating a new sn_messaging crate.
- We are in the process of preparing some minor fixes for a new patch release of the CLI.
- We are also close to having the CLI and authd built with musl, meaning compatibility with a larger range of platforms.
- We have a new stress testing tool for routing with the PR currently under review. This new tool is designed to discover the limits of routing in terms of how it handles membership changes (churn) and has already brought some issues to our attention.
Testnet Update
Thanks to everyone who has taken time to try out the testnet code released before Christmas. With all of the MaidSafe team now back at their desks, we are continuing to work through some issues we identified before release, and others you highlighted for us. Once we are satisfied that these issues have been resolved we will announce another iteration, with the intention of hosting a public testnet for everyone who can connect to it.
Safe Client, Nodes and qp2p
Safe Network Transfers Project Plan
Safe Client Project Plan
Safe Network Node Project Plan
We've been working to improve our Error story throughout our libs, and have transitioned to using thiserror throughout node/data_types/client/transfers to provide a better error chain and some greater encapsulation of functionality. Previously, we were using a lot of mixed errors, pulling a lot from sn_data_types into other libs. Now we have specific errors in each lib, for that lib, and only propagate errors from lower libs as another version of the current lib's errors.
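The layering described above can be sketched with the standard library alone. This is a minimal illustration, not code from the Safe Network crates: `DataError`, `ClientError`, `read_entry` and `fetch` are hypothetical names, and the hand-written `Display`, `Error` and `From` impls stand in for the kind of impls that thiserror's derive generates.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type owned by a lower-level library.
#[derive(Debug)]
pub enum DataError {
    InvalidOwner,
    NoSuchEntry(u64),
}

impl fmt::Display for DataError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DataError::InvalidOwner => write!(f, "invalid owner"),
            DataError::NoSuchEntry(index) => write!(f, "no entry at index {}", index),
        }
    }
}

impl Error for DataError {}

/// Hypothetical error type owned by a higher-level library.
/// Lower-level errors only appear wrapped in one of its own variants.
#[derive(Debug)]
pub enum ClientError {
    NotConnected,
    Data(DataError),
}

impl fmt::Display for ClientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ClientError::NotConnected => write!(f, "client is not connected"),
            ClientError::Data(_) => write!(f, "data operation failed"),
        }
    }
}

impl Error for ClientError {
    // Expose the wrapped error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ClientError::Data(inner) => Some(inner),
            ClientError::NotConnected => None,
        }
    }
}

impl From<DataError> for ClientError {
    fn from(inner: DataError) -> Self {
        ClientError::Data(inner)
    }
}

/// Stand-in for a call into the lower-level library.
fn read_entry(index: u64) -> Result<u64, DataError> {
    if index < 3 {
        Ok(index * 10)
    } else {
        Err(DataError::NoSuchEntry(index))
    }
}

/// Higher-level call: `?` converts `DataError` into `ClientError` via `From`.
pub fn fetch(index: u64) -> Result<u64, ClientError> {
    let value = read_entry(index)?;
    Ok(value)
}
```

A caller of `fetch` only ever sees `ClientError`, while the original cause stays reachable through `source()`.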
On top of this, we've extracted sn_messaging from sn_data_types into its own crate in order to separate out our comms layer, as well as errors that we'll be sending to/from other nodes and clients. This is a small step towards more clearly defining a network "API" of messages and errors and it provides cleaner separation of errors from internal libraries to the client.
As part of this effort, we are exploring different serialisation types, with the end goal of having one which is programming language agnostic. At the moment we are focusing on a simple JSON serialisation (as opposed to the currently used bincode), but are also playing around with Msgpack.
The knock-on effect of all this has been some cleaner code, and much clearer error flows throughout all the involved libs, which is great.
In tandem, we've been removing client "challenges" from the node/client bootstrap flow. These were previously used to verify that a client was holding keys, in order to prevent message replay attacks. But with idempotency coming from AT2 and CRDT data types, this will be handled there. This is yet more simplification for both the client and the node, and it further clarifies network operations as signed messages only.
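To illustrate why idempotent operations make a separate replay check unnecessary, here is a minimal sketch. It is not the AT2 or sn_transfers implementation: `TransferOp`, `Ledger` and the per-sender counter scheme are assumptions made for the example, and signature verification is omitted.

```rust
use std::collections::HashMap;

/// Hypothetical signed operation. Signature verification is omitted here.
#[derive(Clone, Debug, PartialEq)]
pub struct TransferOp {
    pub sender: u32,
    /// Per-sender sequence number (an assumption of this sketch).
    pub counter: u64,
    pub amount: u64,
}

/// Hypothetical ledger that tracks the next expected counter for each sender.
#[derive(Default)]
pub struct Ledger {
    next_counter: HashMap<u32, u64>,
    total_moved: u64,
}

impl Ledger {
    /// Applies `op` only if it carries the next expected counter for its sender.
    /// Returns `true` when the state changed, `false` when the op was ignored.
    pub fn apply(&mut self, op: &TransferOp) -> bool {
        let expected = self.next_counter.entry(op.sender).or_insert(0);
        if op.counter != *expected {
            // A replayed (or out-of-order) op leaves the state untouched.
            return false;
        }
        *expected += 1;
        self.total_moved += op.amount;
        true
    }

    /// Sum of all amounts that were actually applied.
    pub fn total_moved(&self) -> u64 {
        self.total_moved
    }
}
```

Applying the same `TransferOp` twice changes the ledger only once, so a replayed message is harmless without any challenge/response step at bootstrap.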
Previously, to prevent key-selling attacks on the network, we removed all SecretKey-exposing APIs from sn_routing and contained them only within that crate. However, we found that there were multiple complications down the dependency tree caused by this removal, and so agreed to bring back those APIs to allow us to move ahead quickly with the testnet during the holidays. We have decided to tackle this problem head-on right away, and have started refactoring the sn_transfers and sn_node crates, where we hold and use those SecretKeys outside of sn_routing.
The signature aggregation carried out by sn_node when exchanging messages between KeySection and DataSection possibly duplicates routing's consensus accumulation work, as both are undertaken by elder nodes. We are investigating this and carrying out some refactoring to remove this part from the sn_node crate and to trust the consensus messages from sn_routing instead.
And a final wee bit of work is underway removing stream management from nodes. This was put in place to maintain comms with clients, but with recent qp2p changes we can rely on connection pooling there to handle this for us, and so remove a lot of complexity from the node's client handling. We are also in the process of refactoring the qp2p examples into separate parts to demonstrate the echo_service and messaging systems clearly and distinctly. We are doing trial runs with these examples with manual port forwarding to potentially support routers not compatible with IGD in further testnets.
API and CLI
We have been focusing on changes and improvements on the network side. However, we have still been taking care of some minor bugs reported by the community while using the testnet, and so are in the process of preparing some minor fixes for a new patch release of the CLI.
Also, we are trying to get our next release of CLI and authd to be built with musl, which as we know will allow us to run these applications on many more platforms using the same released artifacts. We were able to build them manually already (thanks a lot to @mav and @tfa for their input and contributions to this), so we are now looking to get this into our CI in the coming days.
BRB - Byzantine Reliable Broadcast
We are excited to announce that the Dynamic Membership work has been completed and we have today published several Byzantine Reliable Broadcast (BRB) crates on GitHub. All are linked from the central brb crate.
The BRB system consists of:
1. The core BRB broadcast protocol for members of a quorum to replicate data in BFT fashion.
2. The dynamic membership protocol for nodes to dynamically join and leave an active quorum.
3. Data type wrappers that encapsulate compatible data types (e.g. CRDTs) for transfer via BRB.
4. Comprehensive tests to verify correctness.
5. brb_node_qp2p: an example CLI app/node for manually invoking BRB functionality.
For those interested in digging into the details, slides are available and provide further insight into the system and protocols.
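As a rough illustration of the quorum idea behind the first two items, the sketch below tracks acknowledgements for a single proposal and reports when a supermajority of the current members has acknowledged it. It does not use the brb crates' API: `Actor`, `Proposal` and the strictly-more-than-two-thirds threshold are assumptions made for the example, and the signatures a real BFT protocol would verify are left out.

```rust
use std::collections::BTreeSet;

/// Hypothetical member identifier; a real system would use public keys.
pub type Actor = u32;

/// Hypothetical tracker for one proposal within a fixed member set.
pub struct Proposal {
    members: BTreeSet<Actor>,
    acks: BTreeSet<Actor>,
}

impl Proposal {
    pub fn new(members: impl IntoIterator<Item = Actor>) -> Self {
        Proposal {
            members: members.into_iter().collect(),
            acks: BTreeSet::new(),
        }
    }

    /// Records an acknowledgement and reports whether the proposal can be delivered.
    /// Acks from non-members are ignored; duplicate acks count once.
    pub fn ack(&mut self, from: Actor) -> bool {
        if self.members.contains(&from) {
            self.acks.insert(from);
        }
        self.has_supermajority()
    }

    /// Assumed threshold: strictly more than two thirds of the members.
    pub fn has_supermajority(&self) -> bool {
        3 * self.acks.len() > 2 * self.members.len()
    }
}
```

With four members, three distinct acknowledgements are needed before `ack` returns `true`; changing the member set (the dynamic membership part) changes that threshold.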
Routing
As we all know, relocation is good for the network, facilitating node ageing amongst other things. However, we observed that in some situations we were over-relocating. For example, we were relocating even when we did not have enough elders due to churn, and also relocating nodes that had only just joined. To resolve this, we set up some criteria to avoid over-relocation, aiming to keep the network stable in these scenarios.
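The two situations mentioned above could be guarded by a check along these lines. This is only a sketch under assumptions: `SectionState`, `Candidate`, `should_relocate` and the `ELDER_SIZE` value are hypothetical and not taken from sn_routing, and the logic that picks relocation candidates in the first place is out of scope.

```rust
/// Assumed number of elders a section needs (illustrative value only).
pub const ELDER_SIZE: usize = 5;

/// Hypothetical view of a section at the moment a relocation is considered.
pub struct SectionState {
    pub elder_count: usize,
}

/// Hypothetical relocation candidate.
pub struct Candidate {
    pub just_joined: bool,
}

/// Returns `true` only when neither of the two over-relocation cases applies.
pub fn should_relocate(section: &SectionState, node: &Candidate) -> bool {
    // Do not relocate while churn has left the section short of elders.
    if section.elder_count < ELDER_SIZE {
        return false;
    }
    // Do not relocate a node that has only just joined.
    if node.just_joined {
        return false;
    }
    true
}
```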
An API change was also made to return section info for a specified target name. This is mainly for sn_node usage in upcoming refactoring work.
We put together a stress test for routing (PR under review). It's a little tool designed to discover the limits of routing in terms of how it handles membership changes (churn). It generates random churn according to a configurable schedule. It then periodically outputs various useful information about the network, and measures the network health. This tool will be very useful for the upcoming work of integrating the new dynamic membership solution. Running it on the current version of routing has already uncovered some issues around relocations and splits, which we will look into soon. This is actually good, because the first step of fixing a problem is knowing about the problem. Here is a little screenshot of the tool's output:
Useful Links
Feel free to reply below with links to translations of this dev update and moderators will add them here:
Russian ;
German ;
Spanish ;
French ;
Bulgarian
As an open source project, we're always looking for feedback, comments and community contributions - so don't be shy, join in and let's create the Safe Network together!