We’ve been busy this week patching up the testnet for relaunch. One cause of the errors we saw last time was replication going to too many nodes, which increased memory usage. In turn, this caused rapid price changes, which left clients stuck in a re-spending loop.
We’ve put in a couple of PRs to stop these excessive replications now. We’re testing them internally, and further debugging potential issues.
Tooling improvements are under way to help us track royalties across testnets and get baseline metrics to compare against. We’re also looking at subtler testing techniques with concrete goals, rather than simply lobbing up a couple of gigs of data to see what happens.
We are also looking into a simplified payments regime proposed by @anselme.
The goal is to simplify payments and remove the need for retries and repayments on price changes. The signed quote provides a verified price agreement between the client and node.
In summary, it looks like this:
- Client asks node for price quote, node replies with price + timestamp + signature
- Client gathers total payment amount including royalties
- Client sends payment to node along with the signed price quote
- Node verifies signature is valid and timestamp is within last 10 minutes (for example)
- If valid, node stores data
The key points are:
- Node signing removes the need for repayment if the price changes between the quote and the payment: the node knows it agreed to this quote because it created and signed it.
- Timestamp allows node to reject excessively old quotes (as now).
- This reduces the need for retries and cached payments in the client (as long as the client is trying to upload within the contract expiry time).
- Only a single payment and royalties transfer instead of multiple.
The timestamp here is a contract between the node and itself to make sure the clients aren’t mispaying. It is NOT a contract between the client and the node.
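As a rough illustration of the quote flow above, here is a minimal Python sketch. It is not the actual node code: the `Node` class, its method names, the `|`-separated message layout and the `QUOTE_TTL_SECS` constant are all hypothetical, and an HMAC keyed with a node-only secret stands in for whatever signature scheme the node really uses. The check that the payment covers the quoted price is also an assumption about how a node would guard against mispayment.

```python
import hashlib
import hmac
import time

# Assumed validity window: the "within last 10 minutes (for example)" above.
QUOTE_TTL_SECS = 10 * 60


class Node:
    """Hypothetical node that issues and later verifies its own price quotes."""

    def __init__(self, secret: bytes):
        self._secret = secret  # known only to the node
        self.stored = {}       # address -> data

    def _sign(self, address: str, price: int, timestamp: int) -> bytes:
        # HMAC is a stand-in for a real signature; only the node verifies it.
        msg = f"{address}|{price}|{timestamp}".encode()
        return hmac.new(self._secret, msg, hashlib.sha256).digest()

    def quote(self, address: str, price: int, now=None) -> dict:
        """Step 1: reply with price + timestamp + signature."""
        ts = int(time.time()) if now is None else now
        return {
            "address": address,
            "price": price,
            "timestamp": ts,
            "signature": self._sign(address, price, ts),
        }

    def store(self, quote: dict, payment: int, data: bytes, now=None) -> bool:
        """Steps 4-5: verify the signed quote, then store the data."""
        now = int(time.time()) if now is None else now
        expected = self._sign(quote["address"], quote["price"], quote["timestamp"])
        if not hmac.compare_digest(expected, quote["signature"]):
            return False  # quote was not issued by this node, or was altered
        age = now - quote["timestamp"]
        if age < 0 or age > QUOTE_TTL_SECS:
            return False  # quote is too old (or dated in the future)
        if payment < quote["price"]:
            return False  # assumed check: payment must cover the quoted price
        self.stored[quote["address"]] = data
        return True


# Usage: the client pays the quoted price even if the live price has moved.
node = Node(secret=b"node-only-secret")
q = node.quote("chunk-address", price=100)
accepted = node.store(q, payment=100, data=b"chunk bytes")
```

Because the node can recompute the signature itself, it does not need to remember which quotes it handed out; the quote carries its own proof until it expires.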
Thanks again to everyone who has helped out with testnets and testing. We hope to be able to bring you a longer-lived testnet iteration soon.
General progress
As well as the above payments proposal, @anselme has continued to refine the pay-one-node setup, which requires a bit of refactoring.
@Roland has been tinkering with the testnet deployer so we can pull out more useful info when launching testnets, including getting the genesis multiaddress and node stats. He also created a PR for the deployer to add resiliency to Ansible tasks, so it will ignore errors deploying up to 10% of nodes. In testing, a 2,000-node network with unreachable nodes was still able to deploy successfully using the ignore_unreachable option.
Roland also fixed an issue with client timeouts, which was identified by @loziniak.
@bochaco has completed a change that means that the royalty fee now amounts to 15% of storage payments instead of the one-nano-per-address approach we’ve been using. We also check the total amount received with the notifications and not just the number of notifications. Anselme has updated the wallet software accordingly.
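To make the arithmetic concrete, here is a small Python sketch of the percentage-based fee and the total-amount check. The function names are hypothetical, amounts are assumed to be integer nanos, and rounding down is an assumption; the real rounding rule and whether the fee is added on top of the storage payment are not specified here.

```python
ROYALTY_PERCENT = 15  # royalty fee as a percentage of the storage payment


def royalty_fee(storage_payment_nanos: int) -> int:
    """Royalty due for a storage payment (assumed to round down)."""
    return storage_payment_nanos * ROYALTY_PERCENT // 100


def royalties_received_ok(notified_amounts: list, expected_total: int) -> bool:
    """Check the summed amount of the notifications, not just their count."""
    return sum(notified_amounts) >= expected_total


# A 1,000-nano storage payment carries a 150-nano royalty under these assumptions.
fee = royalty_fee(1_000)
ok = royalties_received_ok([100, 50], expected_total=fee)
```

Counting notifications alone would accept two one-nano notifications here; summing the amounts does not.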
@Qi_ma has split royalty payments across topics to reduce resource usage, and has been working on tests to assess its impact.
@bzee looked at how to gracefully stop the nodes with signals in addition to RPC, and is looking at libp2p to see if there are things to take into account for logging off, rather than using killall or Ctrl-C. He and other team members are also reading up on libp2p Sybil protection.
And @joshuef has been hard at work putting an end to those pesky excessive replications.
Useful Links
Feel free to reply below with links to translations of this dev update and moderators will add them here:
Russian; German; Spanish; French; Bulgarian
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!