Content discovery inside Autonomi

Seneca · May 4, 2025, 6:45am

Allow me to step back from DNS a little to general content discovery, since my comments on that are what triggered this thread. Content discovery requires some kind of collaborative data structuring if we don’t want to rely on centralised parties/authorities, with all the potential for abuse of power that comes with them. I’ll try to explain why I think it won’t work well with the current Autonomi datatype restrictions:

Hashtable restrictions
Fundamentally, the network is a hashtable (dictionary), allowing for key/value storage and retrieval where the key is the network address. But currently it has restrictions on what keys are accepted; the key either has to be a valid cryptographic public key (because the value under that key has to be signed), or the key has to be the hash of the value (in the case of chunks). There is no freedom to store an arbitrary value under an arbitrary key.
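
To make the two key rules concrete, here is a minimal Python sketch of a store that enforces them. It is a toy model, not Autonomi’s API: the class name `RestrictedStore`, its methods, the use of SHA-256 and the `verify` callback (standing in for a real signature check) are all assumptions made for illustration.

```python
import hashlib


class RestrictedStore:
    """Toy key/value store that only accepts the two kinds of keys described above."""

    def __init__(self, verify):
        # verify(public_key, value, signature) -> bool is a hypothetical
        # stand-in for a real signature check (e.g. BLS); it is supplied by the caller.
        self._verify = verify
        self._data = {}

    def put_chunk(self, key: bytes, value: bytes) -> None:
        # Chunk rule: the key must be the hash of the value.
        if key != hashlib.sha256(value).digest():
            raise ValueError("chunk key must equal the hash of the value")
        self._data[key] = value

    def put_signed(self, public_key: bytes, value: bytes, signature: bytes) -> None:
        # Signed rule: the key is a public key and the value must be signed by it.
        if not self._verify(public_key, value, signature):
            raise ValueError("value is not signed by the key it is stored under")
        self._data[public_key] = value

    def get(self, key: bytes):
        return self._data.get(key)
```

Note what is missing: there is no method that stores an arbitrary value under a key the writer simply chooses.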

Public keypairs
To me this seems like a major obstacle to public collaborative structuring of data on the network, like a conversational thread, which I will use as an example. The best method for achieving this with the current restrictions (that I’m aware of) is to use GraphEntries with shared or public signing keys, but it has a serious potential sabotage problem. The basic idea is that a person wishing to start a public discussion thread generates a BLS keypair for sharing, creates a GraphEntry, adds the public key of that keypair as a Descendant entry, which serves as the address of the next reply, and stores the signing key of that pair as Descendant metadata. This way anyone reading the GraphEntry knows the keypair and can upload a new GraphEntry under this shared public key.

Sabotage
But this falls apart if there’s one actor that wants to sabotage the discussion. That actor can simply use the shared keypair to upload a new GraphEntry that doesn’t provide any new shared keypair descendant, effectively locking the thread because no further replies are possible (or only by the attacker(s), which can be even more nefarious). Even if you initially add multiple descendants, including some with keys that are not shared so that you can circumvent the lock and make a new reply yourself, it doesn’t stop the attacker from doing it again the moment you switch back to using a publicly shared signing key.

So to me it seems that only whitelisted group discussions where every participant is trusted to not sabotage are viable under the current restrictions. The root problem is that the potential location(s) for replies have to be specified explicitly in advance, and once these are used without providing new potential locations, the overall data structure becomes immutable and the discussion cannot continue.
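
The locking attack can be reproduced in a few lines of Python. This is a sketch under loud assumptions: `WriteOnceStore`, `Entry` and `open_slot` are hypothetical names, and the “keypair” is just a random secret with its SHA-256 hash as the address, which only models the rule that whoever knows the secret may write there once. It is not BLS and not the real GraphEntry format.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Optional


def toy_keypair():
    # Stand-in for a BLS keypair: the "public key" is the hash of the secret.
    secret = secrets.token_bytes(32)
    return hashlib.sha256(secret).digest(), secret


@dataclass(frozen=True)
class Entry:
    content: str
    next_public: Optional[bytes] = None  # address of the next reply (descendant)
    next_secret: Optional[bytes] = None  # shared signing key (descendant metadata)


class WriteOnceStore:
    def __init__(self):
        self._entries = {}

    def put(self, public: bytes, secret: bytes, entry: Entry) -> None:
        if hashlib.sha256(secret).digest() != public:
            raise PermissionError("secret does not match the address")
        if public in self._entries:
            raise KeyError("address already used")  # entries are immutable
        self._entries[public] = entry

    def get(self, public: bytes) -> Optional[Entry]:
        return self._entries.get(public)


def open_slot(store: WriteOnceStore, root_public: bytes):
    """Follow descendants from the root; return the first unused (public, secret)
    reply slot, or None if the chain ends without offering one (thread locked)."""
    entry = store.get(root_public)
    while entry is not None:
        if entry.next_public is None:
            return None
        slot = (entry.next_public, entry.next_secret)
        entry = store.get(entry.next_public)
        if entry is None:
            return slot
    return None


def demo():
    store = WriteOnceStore()
    root_pub, root_sec = toy_keypair()
    reply_pub, reply_sec = toy_keypair()
    store.put(root_pub, root_sec, Entry("Opening post", reply_pub, reply_sec))
    open_before = open_slot(store, root_pub) is not None
    # The saboteur uses the shared key but publishes no new descendant.
    store.put(reply_pub, reply_sec, Entry("spam"))
    locked_after = open_slot(store, root_pub) is None
    return open_before, locked_after
```

In this model `demo()` reports that a reply slot exists before the saboteur’s upload and that none exists afterwards, which is exactly the lock described above.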

Mutable types
Using a mutable datatype like ScratchPad instead of immutable GraphEntry doesn’t solve the problem. At best it becomes a back and forth game between good actors and bad actors editing their previous uploads to circumvent the sabotage, where good actors have to constantly adjust their previous replies to route the reply addresses around the attacker’s sabotage.

Implicit addressing
However, if we had the ability to store data under arbitrary addresses, app builders could define protocols where the addresses of replies are derived implicitly (in contrast to being specified explicitly in advance). A basic example would be that the first reply to address X will use “address = hash(X+1)”, the second “address = hash(X+2)”, and so on (any prefix or suffix can be added to differentiate protocols/namespaces, create branches, etc). This means that there’s a practically infinite number of potential reply locations, making the aforementioned attack impossible. App protocols can just skip and ignore any reply that doesn’t conform to their schema definitions, or that is flagged by moderators.
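
A minimal sketch of such a protocol in Python, using a plain `dict` as a stand-in for a network that accepts arbitrary keys. The function names, the 8-byte counter encoding, the choice of SHA-256 and the “scan until the first empty slot” reading rule are illustrative assumptions, one possible reading of “hash(X+n)”, not an existing Autonomi feature.

```python
import hashlib


def reply_address(thread: bytes, index: int, namespace: bytes = b"") -> bytes:
    # Address of the index-th reply: hash(namespace + X + index).
    return hashlib.sha256(namespace + thread + index.to_bytes(8, "big")).digest()


def post_reply(store: dict, thread: bytes, value: bytes) -> int:
    # Take the first free slot; there is always another one after it.
    index = 1
    while reply_address(thread, index) in store:
        index += 1
    store[reply_address(thread, index)] = value
    return index


def read_thread(store: dict, thread: bytes, is_valid) -> list:
    # Walk the slots in order, skipping replies that break the app's schema.
    replies = []
    index = 1
    while (value := store.get(reply_address(thread, index))) is not None:
        if is_valid(value):
            replies.append(value)
        index += 1
    return replies


def demo():
    store = {}
    thread = hashlib.sha256(b"example thread").digest()
    post_reply(store, thread, b"msg:hello")
    post_reply(store, thread, b"garbage from a saboteur")
    post_reply(store, thread, b"msg:world")
    return read_thread(store, thread, lambda v: v.startswith(b"msg:"))
```

Here the saboteur’s upload occupies slot 2 but cannot close the thread: the next honest reply lands in slot 3, and readers drop the junk entry while walking the slots.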

Node targeting
One of the reasons that I’ve seen mentioned on these forums not to allow this is because arbitrary addresses for data would allow different kinds of attacks on the network, as it would allow specific nodes to be targeted to store particular data.

I would argue that the current datatypes where a public key is used as an address are just as vulnerable as a datatype where the address would be the hash of a field with an arbitrary value. Just like hashes can be “mined” (bruteforced) to find one that ends up at a particular node, the same can be done with public keys. The algorithmic complexity of generating hashes and of generating public keys is linear in both cases. Even if the public key derivation algorithm is a bit heavier to run, the network could use a heavier hash function or require rehashing X number of times to derive the address.
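
The “mining” argument can be illustrated with a short Python sketch. It assumes that closeness to a node is measured by shared leading bits with a target address (XOR-distance style), and it models a heavier derivation with the repeated hashing suggested above; `derive_address`, `shared_prefix_bits` and `mine` are hypothetical names, and no real key generation is performed.

```python
import hashlib


def derive_address(seed: bytes, rounds: int = 1) -> bytes:
    # rounds > 1 models a more expensive derivation (rehashing X times).
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest


def shared_prefix_bits(a: bytes, b: bytes) -> int:
    # Number of leading bits two equal-length addresses have in common.
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return len(a) * 8 - diff.bit_length()


def mine(target: bytes, bits: int, rounds: int = 1):
    # Brute force: try seeds until the derived address shares `bits` leading bits
    # with the target. Returns the winning seed and the number of attempts.
    attempts = 0
    while True:
        seed = attempts.to_bytes(8, "big")
        attempts += 1
        if shared_prefix_bits(derive_address(seed, rounds), target) >= bits:
            return seed, attempts
```

On average the search needs about 2^bits attempts whatever the derivation is; a costlier derivation, whether it is key generation or extra hash rounds, only multiplies the price of each attempt.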

DNS and lookup speeds
The other arguments I’ve seen are related to the typical “DNS” use case, where people tend to think of the current restrictions as stopping global “DNS” usernames and website addresses (that can be squatted and lost) from becoming dominant. I generally agree with those concerns, but stopping the network from functioning as a general purpose hashtable (dictionary) has a lot of collateral damage.

Generally speaking, we can never have arbitrary collections of data that provide Θ(1) search time complexity on the network level; they would have to be converted to a hashtable in local memory first, which means downloading the entire collection. This is resource intensive because it has to be stored in local memory and it would have to be redownloaded every time the app that uses it is restarted.

At best arbitrary collections are organised and uploaded to the network in some kind of search tree structure that provides Θ(log(n)) average search time complexity. That is not too bad, but organising such data structures collaboratively (for example for distributed DNS) runs into the same problem as the locking sabotage attack I explained initially. So either every user organises and uploads such structures for themselves (extra upload and compute costs per user), or they rely on another party to faithfully do this for them (a centralising force).
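
To illustrate the Θ(log(n)) case, here is a Python sketch of a balanced binary search tree stored as content-addressed chunks, with a lookup that counts how many chunks it has to fetch. The JSON node layout and the function names are assumptions for illustration, and a `dict` stands in for the network.

```python
import hashlib
import json


def put_chunk(store: dict, obj) -> str:
    # Content addressing: the key is the hash of the serialised node.
    data = json.dumps(obj, sort_keys=True).encode()
    address = hashlib.sha256(data).hexdigest()
    store[address] = data
    return address


def build_tree(store: dict, items: list):
    """items is a sorted list of (name, value) pairs; returns the root address."""
    if not items:
        return None
    mid = len(items) // 2
    name, value = items[mid]
    node = {
        "name": name,
        "value": value,
        "left": build_tree(store, items[:mid]),
        "right": build_tree(store, items[mid + 1:]),
    }
    return put_chunk(store, node)


def lookup(store: dict, root, name: str):
    # Each loop iteration corresponds to one chunk download from the network.
    fetches = 0
    address = root
    while address is not None:
        node = json.loads(store[address])
        fetches += 1
        if name == node["name"]:
            return node["value"], fetches
        address = node["left"] if name < node["name"] else node["right"]
    return None, fetches
```

With 1023 sorted entries the tree is 10 levels deep, so a lookup costs at most 10 sequential fetches, where a store with freely chosen keys would need a single fetch at an address derived from the name. Because every node embeds the hashes of its children, inserting an entry produces a new root address, so whoever publishes the root decides what the collection contains.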

Maybe I have overlooked some great solution; I really hope so. But if not, to me these restrictions on the network’s hashtable really don’t seem to be worth it. The main concern appears to be fear of a global DNS system with negative aspects like squatting, but removing the aforementioned restrictions does not make it inevitable that such a system will dominate. This community seems pretty aware of the risks there and, I believe, won’t naively implement such a system. As long as Autonomi doesn’t champion it either, I think it’s best to just let the different designs compete.
