Well, it was bound to happen some time. After a run of successes, the RoyaltiesPaymentNet was cursed by high memory usage which killed off many nodes before they could even start, and left the rest pretty much zombified. The spooky thing is it all worked fine on our internal testnets (albeit with some slightly raised memory levels). Could poor RoyaltiesPaymentNet have been struck down by dark forces beyond our ken?
Or perhaps there is a logical explanation. Chief in our sights is GossipSub, the system by which nodes performing transactions propagate the fact to foundation nodes which then take their share. GossipSub is dealing with many more messages than anticipated. It’s not yet clear if that’s looping, or client top-ups resending royalty payments, or something else.
One issue is that all nodes try to decode all transfers, causing a lot of unnecessary activity; another is that libp2p has been allocating quite generously… We’ve some PRs in to help there and are hopeful this will yet come together!
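To illustrate why decoding every transfer on every node is wasteful, here is a minimal Rust sketch. It is not the actual node code: `GossipMsg`, its `recipient` tag, `decode_transfer` and `handle_messages` are all hypothetical names, and it assumes each message carries a cheap-to-read recipient field that can be checked before the expensive decode step.

```rust
// Hypothetical types for illustration only; this is not the Safe Network API.
#[derive(Debug, Clone, PartialEq)]
struct GossipMsg {
    recipient: u64,   // assumed: a cheap-to-read tag naming the intended node
    payload: Vec<u8>, // assumed: the encoded transfer
}

/// Stand-in for an expensive decode. It counts calls so the saving is visible.
/// Here a "transfer" is just a little-endian u64 amount.
fn decode_transfer(payload: &[u8], decode_calls: &mut usize) -> Option<u64> {
    *decode_calls += 1;
    if payload.len() != 8 {
        return None; // malformed payload
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(payload);
    Some(u64::from_le_bytes(buf))
}

/// Decode only the messages addressed to `me`; everything else is skipped
/// before any decoding work is done.
fn handle_messages(me: u64, msgs: &[GossipMsg], decode_calls: &mut usize) -> Vec<u64> {
    msgs.iter()
        .filter(|m| m.recipient == me)
        .filter_map(|m| decode_transfer(&m.payload, decode_calls))
        .collect()
}
```

With a filter like this, the number of decode attempts on a node scales with the transfers meant for it rather than with all traffic on the topic.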
There are some other fixes to go in too, including libp2p fixes, encrypted transfers, and replication on put changes which should reduce load when we launch another testnet.
We’re grateful that the libp2p team is responsive and open to helping us. This week @dirvine contacted them about building in Sybil defences based on some recent research, and they’ve said they’re open to the idea.
General progress
@roland has been looking into chunk splitting and the payments process, and also added a new feature to the CLI that ensures the user has enough balance before executing an action like an upload.
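A balance pre-check of this kind might look roughly like the Rust sketch below. It is only an illustration under stated assumptions: `check_balance` and `PreflightError` are hypothetical names, and the flat `cost_per_chunk` is a simplification that may not match how the network actually prices storage.

```rust
// Hypothetical pre-flight check; names and the flat per-chunk cost are assumptions.
#[derive(Debug, PartialEq)]
enum PreflightError {
    InsufficientBalance { needed: u64, available: u64 },
    CostOverflow,
}

/// Returns the balance that would remain after paying for the upload,
/// or an error if the wallet cannot cover it. Nothing is sent to the network.
fn check_balance(
    available: u64,
    chunk_count: u64,
    cost_per_chunk: u64,
) -> Result<u64, PreflightError> {
    // Guard against arithmetic overflow on very large uploads.
    let needed = chunk_count
        .checked_mul(cost_per_chunk)
        .ok_or(PreflightError::CostOverflow)?;
    if available < needed {
        return Err(PreflightError::InsufficientBalance { needed, available });
    }
    Ok(available - needed)
}
```

Failing early like this means the user gets a clear error before any chunks are sent, rather than a payment failure partway through an upload.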
@chriso worked on the node management side of things. Windows is always more difficult in this regard and he ran into some issues, but it’s mostly sorted now.
@joshuef investigated high memory usage and looping messages in GossipSub which may have caused the testnet failure, as well as other small fixes, and is looking to implement pay one node which should speed up the validation process and improve performance.
@bochaco created a PR to refactor the transfer validation to make it more efficient, and has also been the main driver of implementing encrypted royalties transfers. Tests are now working.
We’ve been experiencing a few payment failures in testing as we move to only paying one node. @anselme is digging into those, and working to make the issue easier to debug.
@qi_ma has been fixing some other internal tests that were failing.
And @bzee has also been working on pay one node, while additionally offering up some improvements to the API query Kad workflows.
Useful Links
Feel free to reply below with links to translations of this dev update and moderators will add them here:
Russian; German; Spanish; French; Bulgarian
As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!