Safe Network Dev Update - February 11, 2021

maidsafe · February 11, 2021, 5:41pm

Summary

Here are some of the main things to highlight since the last dev update:

The testnet release is still on hold as we work to complete some message flow refactoring, which is blocking rewards from progressing.
Message flow refactoring work is making good progress, with a draft PR in place. This will result in a cleaner, simpler, and more efficient message flow.
We’ve migrated our testnet deployment/take down scripts to use terraform, resulting in a drastic improvement in time taken to create testnets of any size for internal testing/external deployment.
Spending some time working with a no-rewards setup has allowed us to catch and squash some bugs that would have otherwise remained hidden until rewards were fully implemented.
A new $ safe networks set subcommand is being implemented in the CLI which will allow users to more easily connect to networks by simply using their bootstrapping IP:port address, with the corresponding PR going through review now.
We believe we’ve come up with a solution for section chain forking in sn_routing. This solution is currently being implemented, and we believe it will help make testnets stable enough to cope with community probing.
Community code contributions keep on coming in!

Testnet status - on hold

Again, we were aiming to include rewards in a public testnet this week. However, after we switched to doing message accumulation only in sn_routing, the related work to adjust the message flows, which affects all parts of sn_node, has turned out to be more time consuming than we anticipated. This work currently blocks rewards from progressing.

Earlier in the week we decided to focus some energy on the alternative of having a testnet with no rewards, i.e. stripping out some of the rewards flow, something we made a start on last week. We made some good progress here, but hit several blocking issues along the way, which we have had to apply several “hacks” to temporarily resolve until rewards and the related functionality are in place. The resulting no-rewards networks that we have been spinning up have lacked stability, reliability and consistency, so as it stands we don’t believe there is value in putting a testnet up without rewards. This alternative approach did have some benefits this week in that it allowed us to somewhat progress with multi-section testing, which led to us identifying and fixing a few issues which we would have otherwise not seen until rewards and the related functionality were in place.

We still expect to host a testnet asap, with focus shifting back to move forward with rewards again and release a reliable testnet with that in place. Potentially we will do a little more testing using the no-rewards work to see if we can discover anything else lurking for us down the line.

Testnet prep and testing

Towards the end of last week, we were attempting to publish large testnets, and it quickly became apparent that our bash script for this was not up to the job - taking 30mins to launch 20 nodes or so, and taking a hell of a lot longer to launch 100 nodes! As such, we’ve been migrating across to terraform for managing droplet and node deployment/destruction, and that is muuuch better. We can now launch 40 odd nodes in a few minutes. We’ve been using this pretty heavily to iterate, and have it set up now to allow us to deploy custom node builds too, which has proved very handy on the iteration front. The PR for this switch to terraform is in place and pretty thoroughly tested now, with some tidy up work intended before merging.

At one stage through the week we were fairly regularly seeing internal testnets wanting to split, but failing to do so. We started trying to debug with smaller sections (for example, 3 elders, a 5 node section size) to trigger more splits, but this didn’t help. It turned out that we were not seeing splits as our code was depending on Elders moving sections, but this is not actually required (as things stand). As with all things probability, even those unlikely events seemed to happen reasonably often…and so it was that all our elders were falling in one half of the section, and so forming a new section unto themselves, with no key-change needed, and none of our code being hit.
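To put a rough number on how "unlikely" that case is, here is a back-of-envelope sketch. It assumes each Elder lands in either half of a split independently with probability 1/2, which is a simplification rather than a statement about how sn_routing assigns names, and the function name is hypothetical.

```rust
/// Probability that all `elders` end up in the same half of a split.
/// Assumption (not from sn_routing): each Elder independently falls in
/// either half with probability 1/2. Intended for `elders >= 1`.
fn all_elders_same_half(elders: u32) -> f64 {
    // There are two halves, and each one receives every Elder
    // with probability (1/2)^elders.
    2.0 * 0.5f64.powi(elders as i32)
}
```

Under that assumption, `all_elders_same_half(3)` is 0.25, so with 3 Elders roughly one split in four would leave all of them on one side, and even with 7 Elders the figure is about 1.6%.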

With more bug fixes in place there, along with removing some rewards functionality that is being reworked, we’ve squashed an issue that was occurring at network startup whereby on every churn in the genesis section, the newly promoted Elder would re-propose a genesis payout once again, effectively making the rest of the Elders await another genesis payout that was supposed to happen only once initially. With that nailed, we’ve also squashed another related bug where accumulation of the genesis payment proof wasn’t happening (we were storing all payment events except the validation ones); which helped get things moving.

After that, we also came across some looping in sn_routing, which was observed to cause some high memory usage, potentially causing nodes to die. We know where the issue is, and we’ve put a temporary measure in place to prevent it from happening for now. A more permanent fix will come in due course.

With all the above rolled out with the no-rewards branch, we finally got to the stage where we could see the majority of client tests passing, with the fails highlighting a few other issues such as occasionally hitting some code we shouldn’t be able to reach (some error handling required for that), and now we’re currently whittling down an issue with client wallets too. The aim is to get the client test suite just a little bit greener in preparation for when the rewards flow is fully integrated again.

Safe Client, Nodes and qp2p

Safe Network Transfers Project Plan
Safe Client Project Plan
Safe Network Node Project Plan

QP2P

Over the last week, we have tested the new qp2p API with all our crates. There were some issues initially but those are all ironed out now and we are in the final review and merge phase across the board. We will be including these changes during end-to-end testing and they will be a part of the next testnet release.

Messages

Recently we moved over to accumulate messages in sn_routing only, from previously having also done this in sn_node. To finish this refactor, a lot of code has been touched, but also about 1350 lines removed.
The result is a simpler, cleaner, and more efficient message flow.

Work to do this is currently ongoing, with a draft PR tracking progress. This will help us move forward with the message flows in general, but very importantly now also the rewards, which have been held back a bit by these updates.
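For anyone unfamiliar with what "accumulation" means here, the following is a toy sketch of the general idea only: a message is acted on once enough distinct senders have vouched for it. It is not the sn_routing implementation; the type, the method names and the use of a plain sender id in place of a signature share are all assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Toy accumulator (hypothetical, not the sn_routing type).
/// A sender id stands in for a signature share.
struct Accumulator {
    threshold: usize,
    pending: HashMap<u64, HashSet<u8>>,
}

impl Accumulator {
    fn new(threshold: usize) -> Self {
        Self { threshold, pending: HashMap::new() }
    }

    /// Records that `sender` vouched for `msg_id`. Returns true exactly
    /// once per message: on the call that first reaches the threshold.
    fn add(&mut self, msg_id: u64, sender: u8) -> bool {
        let voters = self.pending.entry(msg_id).or_default();
        let was_below = voters.len() < self.threshold;
        voters.insert(sender); // duplicates from the same sender are ignored
        was_below && voters.len() >= self.threshold
    }
}
```

Doing this in one place means the layers above only ever see a message once it has already accumulated, which is the kind of simplification the refactor is after.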

API and CLI

One piece of technical debt we had in our sn_api crate was to make our Error type/s implement the std::error::Error trait. This is something we completed and merged this past week with the help of the thiserror crate. We’ve also changed the CLI codebase to make use of anyhow, so all functions now return anyhow::Result and error handling is made much easier without losing information or context about the root cause for each of the errors propagated.
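As a rough illustration of the pattern, here is a standard-library-only sketch. The error type and functions are hypothetical and are not taken from sn_api or the CLI; the hand-written Display and Error impls are approximately what thiserror would derive, and Box<dyn Error> stands in for anyhow::Result.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical library error type (not the real sn_api Error).
#[derive(Debug)]
enum ApiError {
    EmptyInput,
    InvalidPort(String),
}

// Written by hand here; a derive macro can generate the equivalent.
impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ApiError::EmptyInput => write!(f, "empty input"),
            ApiError::InvalidPort(s) => write!(f, "invalid port: {}", s),
        }
    }
}

// Implementing the trait lets the type flow through `?` into generic
// error containers such as Box<dyn Error>.
impl Error for ApiError {}

/// Library-style function returning the concrete error type.
fn parse_port(s: &str) -> Result<u16, ApiError> {
    if s.is_empty() {
        return Err(ApiError::EmptyInput);
    }
    s.parse::<u16>()
        .map_err(|_| ApiError::InvalidPort(s.to_string()))
}

/// Application-style function: any error implementing std::error::Error
/// is converted by `?` without losing its message.
fn run(port_arg: &str) -> Result<u16, Box<dyn Error>> {
    let port = parse_port(port_arg)?;
    Ok(port)
}
```

The split mirrors the one described above: the library exposes a concrete error type, while the binary propagates whatever it receives and reports the root cause.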

A new $ safe networks set subcommand is being implemented in the CLI which will allow users to more easily connect to networks by simply using their bootstrapping IP:port address. The corresponding PR includes an update to the User Guide, so for anyone interested in providing some early feedback about this command please go ahead and take a look at the description here.

Community contributions kept coming this past week. There is a work in progress effort by @bzee to make the nodejs package compatible with the latest version of sn_api, as well as a fix in the CLI to remove a flag name that was causing a conflict between two different commands (https://github.com/maidsafe/sn_api/pull/708). A PR was also raised and merged to remove the logging implementation from sn_client, as this should be left to the applications or binaries that use the library, making sure applications do not get unexpected output on stdout or stderr.

Since the majority of dependencies of our crates use tiny-keccak v2.0.2, @mav has been sending PRs to update all our crates to depend on this same version. We all very much appreciate the effort from everyone who gets involved in whatever way they can.

BRB - Byzantine Reliable Broadcast

We did some additional work on brb_node_qp2p to get it working with bi-directional streams and the new (coming soon) qp2p API. This enables each node to send and receive from the same port, instead of opening new connections over a separate random port for outgoing packets. In the process, we contributed a couple of small PRs to qp2p. One in particular makes it easier to share a qp2p Endpoint between threads, which should be a win for anyone building a p2p app with the library.

sn_data_types

We have a design to simplify the Policy/Permissions logic governing access to network data, which is currently going through internal review. We also have a PoC for a new Chain CRDT which might prove to be a better underlying CRDT for our Sequence data type. This came out of the issue @mav raised concerning our Sequence data type.

Routing

Project Plan

This week we were exploring a promising approach to fork resolution. To recapitulate: we have something called a “section chain” which is a list of section keys linked together with signatures. It can be used to prove that a piece of data was signed by a group of nodes that were once valid members of a section, even after those nodes are long gone. Currently this chain requires that the section agrees on which key to append to it next. If there is a disagreement on that, the chain can fork into two (or more) mutually incompatible chains which could currently break the section. This can happen, for example, at times of intense churn. We were hoping we could get away without tackling this for a bit longer, i.e. until the testnet was out, but it turns out we are sometimes seeing forks even in relatively small test networks.
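To make the description a bit more concrete, here is a toy sketch of a chain of keys where each key is vouched for by the one before it, plus a check for the forking situation. Everything here is an assumption for illustration: keys are plain integers, toy_sign is a hash standing in for a real signature (it offers no security at all), and none of the names come from sn_routing.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type Key = u64;

/// Toy stand-in for "the parent key signs the child key".
/// NOT a real signature: anyone can recompute it.
fn toy_sign(parent: Key, child: Key) -> u64 {
    let mut hasher = DefaultHasher::new();
    (parent, child).hash(&mut hasher);
    hasher.finish()
}

/// One link in the chain: a new key plus the parent's "signature" over it.
struct Block {
    key: Key,
    parent: Key,
    sig: u64,
}

fn block(parent: Key, key: Key) -> Block {
    Block { key, parent, sig: toy_sign(parent, key) }
}

/// A chain is valid if every block is signed by the key just before it,
/// starting from a root key the verifier already trusts.
fn is_valid_chain(root: Key, blocks: &[Block]) -> bool {
    let mut current = root;
    for b in blocks {
        if b.parent != current || b.sig != toy_sign(b.parent, b.key) {
            return false;
        }
        current = b.key;
    }
    true
}

/// A fork: two blocks that extend the same parent with different keys.
fn is_fork(a: &Block, b: &Block) -> bool {
    a.parent == b.parent && a.key != b.key
}
```

In this toy model, a verifier that trusts key 1 can accept the chain 1 → 2 → 3, while blocks `block(2, 3)` and `block(2, 4)` are each individually valid but form a fork, which is the disagreement case described above.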

So we were busy discussing how best to attack this problem and we came up with a couple of promising ideas. One such idea is currently being implemented which we hope will help make testnets stable enough to cope with community probing. There are still some potential concerns about security and possible attack vectors, but those will be addressed later. Right now the focus is stability. Baby steps.

Useful Links

Feel free to reply below with links to translations of this dev update and moderators will add them here:
Russian ; German ; Spanish ; French ; Bulgarian

As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!

Knosis · February 11, 2021, 5:41pm

First! Thanks for all the hard work!!

hamarana · February 11, 2021, 5:52pm

second
… pretty close this time

davidpbrown · February 11, 2021, 5:55pm

Prompted a thought whether there is ever a case that part of the network would be disconnected from the rest but still large enough to be stable… some power brownout for example - is the limit of what that could be stable known; would it still identify as the same network?.. and if it rejoined, would that difference in data be a collision that would merge over time?..

Anyhow keep at it… a few days more, round the next bend - are we nearly there yet from the back seat doesn’t help… but testing will have to count as holiday this year!

Sotros25 · February 11, 2021, 6:14pm

Here’s a link to the tweet sharing this update, so you can show your support by liking, retweeting, and (especially) commenting!

Almost there, folks

wydileie · February 11, 2021, 7:02pm

Great update. Would it be possible to get copies of the terraform and presumably Ansible scripts that are used to deploy the nodes and software onto them? Are they on github somewhere? Once the testnet is up, I have some funds available on digital ocean that I can use to put up a few nodes, so not having to write those myself (and seeing the resource requirements you guys are using) would be helpful.

TylerAbeoJordan · February 11, 2021, 7:36pm

This Terraform stuff sounds pretty cool. Weird thought crossed my mind that’s probably ridiculous … but, if (and it’s not currently possible obviously), but IF there was a cloud provider that accepted safe tokens, then could the network itself auto-deploy farms on demand!?

Thanks to Maidsafe team and others who are working to make the impossible network a reality. Every week sees it become more and more possible.

Cheers

dirvine · February 11, 2021, 7:59pm

I am sure we could do. I am not sure if they contain some keys etc. but we can check.

goindeep · February 11, 2021, 8:48pm

Aight bad.

Nigel · February 11, 2021, 9:40pm

Is this you? Cool ankle bracelet bro. Jk

Awesome update! Keep trucking

goindeep · February 11, 2021, 9:48pm

yiah aight bad

Secretariat415 · February 11, 2021, 10:34pm

Thanks so much to the entire Maidsafe team for all of your hard work!

bridge · February 12, 2021, 5:09am

Don’t rush, team, you are building networks that will last for decades or hundreds of years or even more.

The crypto is hot these days because of fake liquidity fluctuations. And price increase is very important for all holders. But rational holder never think there’s no chance if Maidsafe doesn’t rise in price during this bull market. In addition, they already have a portfolio that balances blockchain tokens: BTC and Maidsafe.

So, again, don’t rush and only concentrate on the network freedom for mankind. God bless team!

jeremyjpj0916 · February 12, 2021, 6:58am

The testnet release is still on hold as we work to complete some message flow refactoring, which is blocking rewards from progressing.

Are we confident enough yet in remaining concrete changes that we can apply timelines to the first test net deliverable? Maybe 3-4 weeks max based on the seeming backlog and bug crushing?

Cryptoskeptic · February 12, 2021, 7:44am

The answer is always NO. You should know that by now

dirvine · February 12, 2021, 9:16am

True, usually, but here I do think we are looking at weeks or less. There is really one area in rewards that is unclear to me personally; otherwise I feel we are good. Let’s see though.

Traktion · February 12, 2021, 9:18am

Thanks for all your hard work @maidsafe !

I know it can be frustrating when you (and everyone else) can see the finish line, but it feels like someone keeps moving it a few feet further away. I hope everyone understands this isn’t like tailoring a framework or pattern to customise it - this is green field in the true sense, in theory, design and implementation. It is hard. It is uncertain. It will be worth it!

For my part, I’ve read ‘the book’ for Rust, along with the Async one. Then I thought, where better than to familiarise myself with Rust than the Safe Network source code? So, I’ve been looking through sn_cli, sn_api, sn_client and now I’m starting to scratch the surface of sn_transfers and sn_node.

What are my thoughts so far? It is great code. It is easy to read and reason about, nicely decomposed into structs and functions. Coming from a mostly Java background (of late), I was interested to see how OO concepts are represented in Rust and I’m glad to see it works well!

Reading the code helped me to understand some of the details and appreciate just how slick the interface for app developers will be, even at the Rust level. Considering the complexity going on under the hood, this is a massive achievement! I’ve started thinking about how I’m going to implement some of my own apps (for learning and hopefully useful too!).

So, my deep dive into the innards of sn_node will continue and with each hour I spend looking through the code, my appreciation for the team and the product grows. This isn’t some thrown together ball of mud. This is careful and considered development. It is self evident that it is being built to last. The devs have a lot to be proud of and I can’t wait to see how it will change the world!

JayBird · February 12, 2021, 9:38am

Fabulous stuff @Traktion, on many levels. Thrilling to see more people diving in, looking forward very much to you throwing out concepts and designs for your own stuff, and don’t hesitate to fill us in on your journey through the code!

19eddyjohn75 · February 12, 2021, 10:03am

Thx for your incredible hard work Maidsafe devs.

It’s great that others are digging more into the code (that’s how bugs get found eventually). @bzee @mav

Time and time again I’ve tried to push devs that I know to do a PR or look into the SAFE Network code, unfortunately no luck. They are still on this tech

Hang in there super ants eventually we’ll have a testnet with rewards, if you can code please chip in…
