Just a wee note @happybeing but good to get your feedback. I have an internal discussion now on replacing registers and preparing for generic transactions. All logic will be client side for these, so hopefully max flexibility and better ease of use (no more understanding crdt trees and DAGs etc). I should say though app devs should still use one time use keys here to make this work better for them. As we use BLS there you can derive any key from a root key, so that becomes simple but efficient.
Hereās a preview (itās my usual very short to the point stuff here).
Summary
This proposal introduces a simplified, flexible transaction structure for the Safe Network, replacing the existing register-based CRDT system. The new system removes constraints on data growth, reduces complexity in the codebase, and shifts transaction logic to the client side, thereby lightening the load on network nodes.
Motivation
The current Safe Network architecture employs CRDT trees for registers to manage mutable data. However, this approach presents limitations:
Registers have a growth cap, leading to scalability issues.
Transactions are tied to specific data types with hardcoded node rules, increasing complexity.
Network nodes bear the computational burden of managing transaction logic.
By introducing a generic Transaction type, we aim to:
Simplify node implementation.
Eliminate the growth constraints of registers.
Enable more diverse use cases through flexible client-side transaction validation.
Reduce the network load by decentralising transaction logic to clients.
Detailed Design
Proposed Structure
struct Transaction {
owner: PubKey,
parent: Vec<PubKey>
content: [u8; 32], // 32-byte array
outputs: Vec<(PubKey, content)>, // allows currencies or fork type data behaviour (if necessary) Will usually be a single entry
signature: Sig // signs the above 4 fields with the owners key
}
Key Features
Client-Side Logic:
Validation and rules are implemented exclusively on the client side.
Nodes only handle basic validation and storage.
Generic Use Cases:
Currency: Transaction chains can track ownership and validate output balances.
Filesystem Directories: Chains resolve forks, with clients choosing the branch to extend.
Metadata Linking: The content field can point to network-stored chunks, enabling rich metadata references.
Examples
Currency
Client Check:
Ensure parent transaction is unique and matches the child as an output.
Verify outputs of the parent are non-negative.
Trace transaction lineage back to a genesis block or sufficient depth.
Filesystem Directory
Client Check:
Identify forks as nodes with multiple children.
Follow both branches and merge or resolve conflicts client-side.
Use content to point to metadata for directories.
Advantages
Simplicity:
The transaction type is universal and straightforward.
Removes the need for complex register-specific logic.
Scalability:
No inherent size limitations in transaction chains.
Flexibility:
Decentralised logic allows applications to define custom rules and behaviours.
Supports diverse use cases beyond the limitations of current registers.
Performance:
Offloads computational overhead from nodes to clients.
Reduces network traffic and processing.
Drawbacks
Increased responsibility on clients to implement robust validation logic.
Potential for misuse or inefficiency in client-side implementations if not well-designed. This is true for any client app though
Alternatives Considered
Enhancing Current Registers:
Addressing growth limitations within the existing CRDT structure.
This approach would still retain much of the complexity in node rules.
Hybrid Models:
Combining CRDTs and transactions.
Adds complexity without eliminating existing limitations.
wow - this is quite a significant change ā¦ but might take a lot of work off the network and to the client side
the fun aspect would be ā¦ if this works reliably for data ā¦ there is no need to distrust it for coin TXs ā¦ and with creating and using this structure weād validate the Token system xD ā¦
if I understand correctly the public key of the TX would be the network address
so because the retrieved TX would be unique it would require the use of 1-time keys - right?
aaaand it would require to know the child already when creating the TX (and if the private key to the child is lost the chain ends at this point forever xD ā¦)
the good thing about this approach is its flexibility ā¦ just today I was thinking about how to utilize the network as a time series database ā¦ with registers itās somewhat cumbersome ā¦ with TXs I would ācreate a new time storeā with:
basetime snapshots
higher time intervals that get snapshotted (for performance increase)
starting TX
first aggregated TX (for each upper level timeslot)
and every time step would contain a pointer to a dump of a collection of timepoints + the pointer to the following timestep ā¦ like that I could write 1min / 5 min intervals as batch and e.g. hourly data to improve performance ā¦ most certainly I would prefer to write the timestep data into a single chunk and not self-encrypted split into many pieces because it will be rather small (compared to 4MB)
ā¦a somewhat nice way to store and retrieve time series data without hosting a database myself ā¦ (but hard to not see the chains it creates throughout the network ā¦ hard to claim weāre chainless with this chain of TXs anyway xD I guess itās not everything bad about chains! )
@dirvine please can you include something of the API thatās being provided either in your post above or via a link to relevant Rust docs? It should help us understand this from a developer perspective.
Iām just about to publish an API lib of my own which gives versioning functionality for any data type (structs, collections etc) based on Registers I was so pleased with it as itās my first use of Traits and a templated type for the History.
Yes thatās right. You can have forks in the same location but it owed not make sense to use the same key. We would provide a next method so it creates a new key for you. So you alway supply your master key pair and add more items at any time. Those will spread across the network.
I feel this is the key, we donāt force rules but let clients come up with any scheme they wish.
Yes, thatās true, but it decentralises the chain storage. So unlike a blockchain or whatever, here you have millions and millions of chains of pointers or values. So you should be able to create almost any app/currency we want and test them out
Itās not in code yet, but should be very quick there. Itās basically the type as shows with a few helper methods for adding to the chain. Forks or no forks can all be enforced client side.
Will this new feature still allow public as well as private transaction chains?
Iām assuming that knowing where the head is of a busy public chain could be challenging, if forks are to be avoided, i.e. knowing which transaction to append to.
Tbh, transactions sound like how Iāve been trying (and having issues with) registers. The limitless chains size also removes a big limitation there.
Yes, my first thoughts are weāre losing a lot of capability and replacing one set of client-side complexity with something at least as complex for less usefulness. I get that something had to be done to handle the size issue, but there are other ways to do that.
Big win seems to be simplifying nodes, but Iām not sure it reduces network traffic - we canāt really know that kind of impact until we see what apps have to do to achieve similar functionality.
Yes, just by encrypting the link you can be private. Same with output child (only holders of decryption capability knows actually where itās pointing).
Yes, like registers, however I am not sure many folk wills start at the start, if that makes sense?
i.e. you post some data in a web site, link to another source of data (a transaction type) and you always want to make sure thatās always the same data. However you can also see itās updated, if you wish. Then more sites do more links later on and the same things happens, but they may choose to share a later version.
So I am not sure how many use cases will want the whole chain, but itās like a dot matrix printer really (in how fast this stuff can be). (i.e. if we Get a large file we are reading thousands of chunks and decrypting etc. so a chain of thousands would not be an issue IMO). Getting thousands of these per second should be possible with the network operating properly. Well clients should be able to send/receive thousands of messages per seconds and say, 30-40 messages per data time, I should be fairly fast to go down the chain.
It wonāt reduce traffic to read the whole chain for sure. But for individual elements itās likely faster, but I donāt think that part is an issue to worry about as much. Itās more reduce complexity and increase flexibility.
This will be fine for many applications, but something that is frequently updated will die pretty quickly if say, a social media app has to read the entire chain every time it is installed on a new device.
A way around that is to have centralised points to pick up later points in the chain (similar to peer discovery) and that seems like a really poor consequence.
Iām also concerned about the point @Traktion raises about discovering the ācorrectā (concensus) head of a chain with branches and donāt understand your response to that. CRDTs were a brilliant solution to that. I guess clients could still use them in some way, though not sure how, but it would be a shame for this capability not to be a feature of Autonomiās API.
So, you could see a chain of transaction, but not necessarily who each owner was.
Presumably, in token context, multiple outputs could allow the āsplittingā of a token amount, with some going back to the original owner (or their new pseudonym address) too?
Sounds a bit like a blockchain in that regard. However, there would be lots of independent chains, for each genesis token. Pretty cool.
Presumably, āwalletsā to track which transactions belong to you would save raking through transactions from genesis too?
Yes, understood. Many use cases would be more concerned with the recent branches. File systems, forum posts, anything where trees are useful.
However, Iām wondering about how to ensure there is only one child, to form a public list, e.g. sequential versions of something, that anyone can update. These cases I donāt want a tree and branches would likely be pruned/ignored (by the client app).
If I query the head (i.e. latest tx), append a tx to it, but someone else does the same thing at the same time, presumably it is up to the client app to resolve this? Iād like to understand the mechanics of public list appends, really.
just another entry same location pointing to the head? xD ā¦ could result in an ugly mess at one point ā¦ (and since chunks are stored close to their address that particular point of the network would get somewhat clogged - right ā¦? possible local flooding attack?)
Knowing the latest value of a list is really useful. Having to navigate the whole chain will quickly get boring/slow.
Maybe registers were trying to do too many things. Having a complimentary type that is a register in the traditional sense could be handy - a single/current value, with no history. Used like a variable.
It could be used as a āpointerā to a transaction on the transaction list/tree.
Could be a common/shared version or private ones. Common/shared could be abused, but a rescan would overcome.
Edit: btw, I used HEAD in the git context, rather than linked list parlance. To be clear, getting the TAIL or last/most recent node is my concern.
Sounds like a plan here. No part is the best part someone is known to have said. Removing the work from the nodes is so much more Ant behaviour.
You mention āchainsā. I am assuming this is just following the parent pointers to traverse the chain in reverse.
While it adds data to each transaction, could there be a use of having the āOriginā transaction so that an APP can see the origin without having to load down the network retrieving each and every transaction in the chain to get to the origin transaction?
Yes Rob, folk can use a chain of transactions. Myself I am looking at ways to share parts of neural networks So no chain there, just a transaction.
With registers this would be the same, so a register might hold a few hundred entries, but then it needs to split. With a CRDT tree you cannot split it so it grows forever, that would not work for us as it could grow more than 32 Gb of whatever nodes should hold.
Itās app defined, but just like a CRDT DAG, so you can have forks (more than 1 entry exists for a key) and you follow the fork to get the longest one or whatever your app decides is the branch to follow, if that makes sense.
Yes, you can do that for sure.
There is a lot of ways to do that, but an approach I like is randomised parent checking. So for a currency you have a rule, any duplicate transaction exists and that one is dead (a no fork rule). Then instead of going all way back to genesis you can go back to known points, or use a complexity algorithm to go back a certain amount of parents depending on the complexity of each transaction (i.e. many inputs is complex, a single input is simple).
Yes for sure there would. For currencies that would be a component and I am sure devs could have Audit nodes that just continually traverse back to genesis in that way. A bit like our DAG node concept.
This sounds like a great advance for nodes and so the network as a whole.
Is this going to be a pre-launch or post-launch change? And if pre-launch, will it set us back time-wise any or can it be whipped up fairly quickly and tested?
Cheers and nice out of the box thinking @dirvine !
It should be very quick to get in place and the client API updated (as it has to anyway). I imagine we mark registers as deprecated and force use of something like this instead.
David, I probably wonāt dive deeper at this point so just a couple of comments.
This isnāt really the answer to my point - youāre saying in effect that Registers also turned out not to be scalable for this kind of use and that Transactions doesnāt solve that issue for many anticipated use cases. Social media being a big one, but anything that can grow over time that is around for a long time or has many āauthorsā is going to be difficult or perhaps unachievable.
I realise this is in some ways much better than Registers, having found them problematic for months myself, but it appears to leave a key use case for them unsupported. Decorum for example. Anything where history is important over long periods risks becoming impractical to maintain in a history. So again we can end up with either centralised solutions or a decentralised web that forgets data by fragmenting in some way (even if the data itself remains stored on nodes, indefinitely). Thatās my concern anyway.
Transactions, the name
Unrelated, Iād like to suggest a change in name for this type because Transactions are an application level concept. The name implies certain use cases that will get in the way of discovery and understanding for developers who can use this to record a sequential tree of anything.
I suggest a name that fits the underlying properties but isnāt already over loaded with meaning (or with misleading meaning anyway). Naming is hard!
At first I thought āStepā, rather than Link or ChainLink which are already used heavily but might still be worth considering. Maybe thereās a better term.
For my API I came up with History and would be content to let you adopt that for an API which creates a history for given data-types based on Transactions. The point is that you can have a history of anything so I chose Trove (collection of valuable stuff) for the underlying unit and so have:
Trove a trivial trait you implement on anything you wish to create and manage a history for.
TroveHistory<T> as a templated type so you can create and find entries using a templated API for your type.
FileTree, a ready to use struct for storing a tree of files or a website, for which TroveHistory<FileTree> provides the API for managing its history, or similarly for any developer defined type.
Details are in docs.rs as of yesterday (here). Not usable by third parties yet but working in awe. It uses Registers for but will be easy to refactor for Transactions when they appear, or if you wish to include a History style API directly in the autonomi crate and steal the terminology Iāll be fine with that. If so, I would though like you to keep the Trove trait as so that the TroveHistory<T> can specify the type using the first element in the chain.
Thinking about the long list, finding most recent tx, issueā¦ perhaps it comes down to two things:
App owner or moderator periodically updates the app config to point set the best root/tip tx.
App user periodically does the same, either locally or to autonomi.
1 would be ideal for new app users who donāt want to traverse from root to tip. It requires an element of trust, but app owner/moderator is there for that.
2 would be ideal for the untrusting new user or frequent user who can set their own root/tip anyway.
As both of these would mean updating another tx list, it would need to be done less frequently to be worthwhile.
To optimise, nesting tx lists may be needed. This could allow an app root list to be most infrequently updated (as it needs to persist for a long time/indefinitely). Child tx lists could be used to prune old tx lists when they get too long.
Moreover, nested lists are essentially branchesā¦ so maybe instead of multiple lists, the above can be simplified with multiple branch depths.
It would take a bit of figuring out how to be optimal, but at least history is always retained. An alternative single value register/variable to point to root/tip would lose that.