It’s great to have reached the stage where, client side at least, we can check into the forum, ask for logs when something isn’t working, and fix it on the fly. Case in point: the problem with missing chunks when downloading large files. In that case it was a matter of batching chunks when downloading.
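For anyone curious what batching chunk downloads can look like, here is a minimal sketch. It is not the actual client code: `ChunkAddress`, `fetch_chunk` and `download_in_batches` are hypothetical names made up for illustration, and the "network fetch" is a stand-in that just echoes the address back.

```rust
/// Hypothetical 32-byte chunk address; not the real Safe Network type.
type ChunkAddress = [u8; 32];

/// Stand-in for a network fetch (assumption: a real client would query
/// the network here). It simply returns the address bytes as the "chunk".
fn fetch_chunk(addr: &ChunkAddress) -> Result<Vec<u8>, String> {
    Ok(addr.to_vec())
}

/// Download chunks in fixed-size batches, so that at most `batch_size`
/// requests belong to the batch being processed at any one time.
fn download_in_batches(
    addrs: &[ChunkAddress],
    batch_size: usize,
) -> Result<Vec<Vec<u8>>, String> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut chunks = Vec::with_capacity(addrs.len());
    for batch in addrs.chunks(batch_size) {
        // A real client would probably run the requests of one batch
        // concurrently; they are sequential here to avoid dependencies.
        for addr in batch {
            chunks.push(fetch_chunk(addr)?);
        }
    }
    Ok(chunks)
}
```

The idea is simply that a large file with thousands of chunks never has all of its requests in flight at once, and the chunks come back in the same order as the addresses.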
In fact it’s been a good week all round for fixes, many to do with our internal testing and integration, but some more visible too. One that’s just gone in deals with a problem where clients were overpaying for uploads. Every time a repayment is required, the client (currently) repays all the nodes in the close group rather than just the one with the higher rate. So while we still need to improve this (to not overpay), nodes were actually just discarding the free money they were being sent if they already held the data. Now they take any tokens sent to them before checking whether the amount is actually enough (the token transfer cannot be reversed anyway).
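As a rough illustration of the "take the tokens first, check afterwards" behaviour, here is a small sketch. `NodeWallet`, `PaymentOutcome` and `receive_payment` are hypothetical names for this example only, and amounts are plain `u64` values rather than real token types.

```rust
/// Hypothetical node-side ledger; names are illustrative only.
#[derive(Debug, Default)]
struct NodeWallet {
    balance: u64,
}

#[derive(Debug, PartialEq)]
enum PaymentOutcome {
    /// The payment covered the store cost.
    Sufficient,
    /// The tokens were kept, but the amount fell short of the store cost.
    Insufficient { shortfall: u64 },
}

impl NodeWallet {
    /// Credit the tokens first (the transfer cannot be reversed),
    /// then compare the amount against the store cost.
    fn receive_payment(&mut self, amount: u64, store_cost: u64) -> PaymentOutcome {
        self.balance = self.balance.saturating_add(amount);
        if amount >= store_cost {
            PaymentOutcome::Sufficient
        } else {
            PaymentOutcome::Insufficient {
                shortfall: store_cost - amount,
            }
        }
    }
}
```

The point of the ordering is that the balance goes up whether or not the payment turns out to be enough, so nothing sent to a node is thrown away.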
On investigating a client log from @Toivo, which showed the client looping on chunk uploads, Qi found the cause to be lost network connections and routing tables. It may be best to simply terminate the client in this case and get them to start again.
It’s also been a great week for external collaboration. We had a productive meeting with Max Inden of IPFS who is one of the main libp2p Rust guys, explaining how we use Kademlia/Libp2p, which is quite different to how IPFS uses it. In IPFS it’s about tracking who is storing the data rather than managing the data itself. As such it’s designed for small data transfers, whereas Safe is designed for large ones, which is why we have our own replication mechanism. Anyway, it was a good meeting of minds and we look forward to continued collaboration and seeing how we can contribute upstream, AutoNat being a prime target.
Thanks to everyone who’s been spotting anomalies and sending logs. Much appreciated. Sometimes it’s hard to reproduce errors on our end, so the logs can be super helpful. Josh, Qi and Roland have been sifting through them and they helped us fix a number of glitches this week.
We’ve tracked down the root cause of the main memory leak to nodes “dialling back” when they notice a peer is lost, or apparently behind a NAT. This process does not die as it should, so we’ve removed some of the code around that and seen a decent drop in memory. We’re tracking the effect of doing that now, as well as watching to see if there are other areas we could improve here.
HeapNet2 is proving to be a sturdy beast. We appreciated @aatonnomicc’s comment “the quiet in hear is deafening means we are finding less and less to moan about lol”
Now that chunk uploads and downloads seem stable, we can focus on registers, paying the Foundation and optimising replication.
We hope you’ve noticed the guys have been out and about on the forum this week. If you see anything amiss just ping us. The next testnet should be around royalty payments; we’re just working through a new setup to verify things there.
General progress
@anselme has been making payments faster and more efficient, by turning CashNotes (heavy) into transfers (light) as soon as possible in the code to avoid dealing with big CashNotes that must be read from disk. This results in a smaller memory footprint, as transfers are much lighter than CashNotes, and much less disk I/O.
He also fixed a potential double spend flaw by making any split an error, with the exception of registers for which, as CRDTs, splits are handled client side.
@roland has opened a PR to allow users to configure the Open Metrics server port. He’s also looking at self-healing via queries and potential caching angles.
@chriso has created a new crate called sn_releases to expose a repository for downloading and extracting our release binaries. He also made a PR to a Rust crate for managing system services. This work will open the door to a tool for managing nodes (to allow for automated updates).
Fresh from his victory in tackling the Mystery of the Missing Chunks on download and cracking the Client Loop Conundrum, @qi_ma is now sleuthing a similar enigma: Chunkless Node Syndrome. He has some interesting leads around PeerId bias, which may well explain some uneven node / chunk / reward distributions.
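To show in general terms how ID placement can skew who ends up holding what, here is a toy sketch of Kademlia-style XOR closeness. It is not Qi's analysis and not the node code: it assumes 1-byte IDs for readability (real PeerIds and chunk addresses are far longer), and `closest_node` and `chunks_per_node` are hypothetical helpers.

```rust
/// Pick the node whose ID has the smallest XOR distance to the chunk address.
/// Hypothetical 1-byte IDs keep the arithmetic easy to follow.
fn closest_node(nodes: &[u8], chunk: u8) -> Option<u8> {
    nodes.iter().copied().min_by_key(|n| *n ^ chunk)
}

/// Count how many chunk addresses each node is the closest node for.
fn chunks_per_node(nodes: &[u8], chunks: &[u8]) -> Vec<(u8, usize)> {
    nodes
        .iter()
        .map(|&n| {
            let count = chunks
                .iter()
                .filter(|&&c| closest_node(nodes, c) == Some(n))
                .count();
            (n, count)
        })
        .collect()
}
```

With node IDs 0, 1 and 128 and every possible 1-byte chunk address, the two neighbouring IDs split one half of the address space between them (64 addresses each) while the isolated ID is closest to the other half (128 addresses). Clustered IDs mean fewer chunks per node, which is the general kind of unevenness a bias in IDs could produce.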
@bochaco has pretty much completed network royalty payments, including validating the amounts paid per node address. This is now ready, at least as a first step, for having network royalty payments made by clients, verified by nodes, and notifications sent by nodes through GossipSub. All nodes are subscribed to the GossipSub topic by default, but only royalty payment notifications are sent over it.
@joshuef has been looking at how many nodes need to be paid for holding a chunk. In theory it could only be one, which has the huge benefit of simplicity, but could be risky if the node goes down before replication can happen. @bzee is running tests on that scenario. Josh and @dirvine are also working through how to identify and eject/ignore bad nodes.
Useful Links
Feel free to reply below with links to translations of this dev update and moderators will add them here:
Russian
German
Spanish
French
Bulgarian
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!