The market demands payment … so little dips will drive greater volatility as losses incurred will further reduce node supply, driving the price for storage even higher and stalling out uploads even more … which then may further drive down the value of the token (as it represents the value offered by the network), and rinse-wash-repeat, we have our cascade collapse.
I expect that node operators will create supply that is both elastic to price increases, and inelastic to price decreases.
I expect an elastic supply for price increases because it’s easy to spin up new nodes with little increase in marginal cost, and the incentive to do so is clear.
I expect a relatively inelastic supply for price decreases because I think most node operators will 1) be thinking long term to a degree and be aware of crypto volatility and cycles, 2) have an interest in maintaining the network’s integrity, and 3) find it takes more effort to take nodes offline than to leave them running.
Relevant to point 1) above, in times of lower demand, speculative node operators may choose to increase supply when the token price falls against fiat by more than the storage cost does, because they then earn a bigger quantity of tokens per node, which will be worth more if the market for the token recovers (for example, if the token halves against fiat while the fiat-equivalent store cost only falls 20%, each node earns roughly 1.6x the tokens).
If correct, all of this means that the network shouldn’t see a significant drop in nodes running, or the corresponding increase in store cost, if the only thing that changes is a short-term drop in the token price or demand for storage.
So as in my post above, I do think there’s a very real threat if there is a long-term downtrend in demand for storage, but hopefully this explains why I don’t expect short-term volatility of token price or demand for storage to be likely to cause a ‘death spiral’.
Node operators aren’t monolithic. While I agree that on the whole node operators are going to be flexible, and that many will keep their nodes running until the whole network collapses (and even after), many others will be more sensitive to the value they receive in return.
What I have emphasized many times is that a run-away collapse only has to happen once. The fact that it can happen hasn’t really been disputed - it’s always been a question of “what are the odds” … Hence I’ve been arguing that we need to do all we can to realistically reduce those odds as everyone’s data is dependent on this point.
Permanent data is a sort of Ponzi scheme in that it requires continued growth in data; nothing we know of grows forever, so if we are being honest, the collapse is inevitable.
So the options are:
1. Just wait - perhaps we get 2 years, or maybe we get 20, and if we are really lucky we get 50 … I’m leaning toward the 2-year scenario.
2. Add real time-based temporary data to the network that allows data to die, so the network can adapt to data shortfalls by allowing a reduced node count without causing the price for data storage to shoot to the moon.
I agree on this. I just think it’d be far more likely in the case of sustained demand reduction vs short term volatility… but I guess that’s kind of beside the point that if it’s a threat, mitigations need to be considered.
I don’t think that the fact there’s a legitimate threat leads to the conclusion that permanent storage is a ‘Ponzi’, or that the only solution is binning the permanent data model. I think the model may be sustainable under realistic conditions, but it’s essential to consider mitigations, and there may be good ones that aren’t anywhere near as extreme as the one you see.
Archive nodes could help, and other options may exist that even in the worst case could avoid total death of the network, even if some damage is done.
I thought you might say that - let me explain better - nothing non-ephemeral grows forever. Ideas and data are ephemeral. Once you make them real, or pin them to reality in some way - e.g. storing data on a hard drive - that ephemeral thing is pinned to reality. Such pinned data cannot exist forever. The network deals with this by having many safeguards - for instance keeping many copies of the data - but there are many threats to that pinned data, and I believe that eventually the pin will fail for any particular bit of data, and so eventually for all data on the network - even if that threat is simply a more successful competitor to the network that will exist in the future. Perhaps some or even much of the pinned data gets repinned on a new platform, but there are no guarantees there, and certainly more cost would need to be imposed on some entity to support that.
Ponzi is just an analogy as I don’t know how else to describe a system that requires a constant feed of input or it collapses. Perhaps there is a more apt term for it? Networks are living things and as such must eat to live.
If we recognize and agree that there is ultimately no such animal as permanent data, then it’s really just “indefinite” data. So what I am trying to convey is that we need a method to allow the network to be more resilient to the problems that hosting indefinite data causes.
They may reduce costs further, but they don’t really solve the underlying issue.
Simply introducing time-based data, even alongside permanent data, would solve the problem: nodes could then refuse to accept additional permanent data if things get bad, which would drive the price to store permanent data much higher, but the network could continue to exist because it has temporary data ongoing to maintain itself.
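To illustrate the sort of behaviour I mean, here’s a minimal sketch of a node-side admission policy - the thresholds and names are purely hypothetical, nothing from the actual codebase - that refuses new permanent data under pressure while continuing to take temporary data:

```rust
/// Which kind of upload is being offered to the node (hypothetical).
enum UploadKind {
    Permanent,
    Temporary,
}

/// Hypothetical admission check: under storage pressure the node stops taking
/// on new permanent commitments, but keeps accepting temporary data, which
/// keeps churning (and paying) and so keeps the node alive.
fn accept_upload(kind: &UploadKind, used_fraction: f64) -> bool {
    match kind {
        UploadKind::Permanent => used_fraction < 0.80,
        UploadKind::Temporary => used_fraction < 0.95,
    }
}

fn main() {
    // A node that is 90% full refuses new permanent data but still earns on temporary data.
    assert!(!accept_upload(&UploadKind::Permanent, 0.90));
    assert!(accept_upload(&UploadKind::Temporary, 0.90));
}
```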
Additionally, having time-payment-based data opens up a huge market for us, so it makes little sense for the project to ignore it as an option. There are some technical hurdles, but I don’t think they are large.
We don’t need this right away either, I just want to see it added to the roadmap.
I agree that it’s not urgent, but the reality of the threat of a ‘death spiral’ from sustained demand reduction warrants consideration of mitigations, and one of these could be adding temporary data.
This may also be true, and could be a good opportunity to diversify the network’s offering in the future once launch and early optimisation aren’t the focus.
It’ll be interesting to see how the future community organises itself in terms of governance to determine priorities for development once things are up and running… maybe a voting system for polling node operators could help dev teams keep a sense of what operators feel are the top priorities for development of the network… hmm, sounds like an interesting Autonomi-based project!
When data is uploaded to the network, it is split into chunks. Some of those chunks may already exist on the network, some may not.
To reconstitute a file, you need access to all the chunks. If some of those chunks disappear, due to time-limited storage, then other files are impacted.
Moreover, the network doesn’t know which chunks belong to which file. It is a one-way relationship - a datamap tells you which chunks are associated with a file, but there isn’t a way to derive which files are associated with a chunk.
This provides anonymity and plausible deniability to node operators. They have no way of knowing which files they are storing. They just see a bunch of disassociated chunks.
Moreover, this allows deduplication at the chunk level. If two different people upload files with overlapping content, there is a probability that they will share chunks.
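For anyone who hasn’t dug into this, here’s a deliberately simplified Rust sketch of why the relationship is one-way and why dedup falls out of it - toy types and a toy hash of my own, not the real self_encryption API (as I understand it, the real network derives a chunk’s address from a cryptographic hash of the encrypted chunk):

```rust
use std::collections::HashMap;

// Toy stand-in types, not the real network's.
type ChunkAddress = [u8; 32];

/// A datamap knows which chunk addresses make up one file...
struct DataMap {
    chunk_addresses: Vec<ChunkAddress>,
}

/// ...but a stored chunk is just content at a content-derived address,
/// with no back-reference to any file or datamap.
struct Chunk {
    content: Vec<u8>,
}

// Toy content-derived address (the real thing would be a cryptographic hash).
// Identical content -> identical address -> dedup.
fn address_of(content: &[u8]) -> ChunkAddress {
    let mut addr = [0u8; 32];
    for (i, b) in content.iter().enumerate() {
        addr[i % 32] ^= *b;
    }
    addr
}

fn store_file(store: &mut HashMap<ChunkAddress, Chunk>, chunks: Vec<Vec<u8>>) -> DataMap {
    let mut chunk_addresses = Vec::new();
    for content in chunks {
        let addr = address_of(&content);
        // Dedup: if another file already produced this chunk, nothing new is stored.
        store.entry(addr).or_insert(Chunk { content });
        chunk_addresses.push(addr);
    }
    DataMap { chunk_addresses }
}

fn main() {
    let mut store = HashMap::new();
    // Two "files" that happen to share one chunk of content.
    let a = store_file(&mut store, vec![b"shared chunk".to_vec(), b"only in A".to_vec()]);
    let b = store_file(&mut store, vec![b"shared chunk".to_vec(), b"only in B".to_vec()]);
    // Four chunk references across the two datamaps, but only three stored chunks.
    assert_eq!(a.chunk_addresses.len() + b.chunk_addresses.len(), 4);
    assert_eq!(store.len(), 3);
}
```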
If you start adding timers to chunk storage, you’re opening a whole can of worms, which impacts the security and scalability of the network in a multitude of ways.
It always was permanent data, and I don’t think there is any data out there indicating that isn’t possible in our world… no matter how often (the same) people stress that they can’t believe it will work, there really is nothing that supports their fears (and it would come with many downsides).
Perhaps another way to let the network breathe with demand is to allow the number of chunk replicas to be somewhat flexible.
Storing more copies when storage is cheap gives a buffer, allowing fewer copies to be stored when storage is expensive (in both senses of the word).
Another mechanism could be to keep a tally of how frequently each chunk is accessed. If data must be culled for network survival, culling the least-accessed data may be the least harmful. Or, if replica quantity could be set per chunk, the least-used chunks could be replicated less (and perhaps naturally be drawn towards the most stable nodes as a result).
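A rough sketch of how a flexible replica policy along those lines might look - the floor, ceiling and access threshold are made-up illustrative numbers, not anything from the actual design:

```rust
/// Hypothetical per-chunk bookkeeping a node might keep if replica counts
/// were allowed to flex with conditions.
struct ChunkStats {
    access_count: u64,
}

/// Pick a replica target between a floor and a ceiling: shrink towards the
/// floor when storage is scarce/expensive, and give frequently-read chunks
/// an extra copy.
fn replica_target(stats: &ChunkStats, spare_capacity_ratio: f64) -> usize {
    const MIN_REPLICAS: usize = 3; // survival floor
    const MAX_REPLICAS: usize = 8; // buffer built up while storage is cheap

    // More spare capacity across the network -> more copies we can afford.
    let base = MIN_REPLICAS as f64
        + (MAX_REPLICAS - MIN_REPLICAS) as f64 * spare_capacity_ratio.clamp(0.0, 1.0);

    // Hot chunks earn an extra copy; cold chunks sit at the computed base.
    let bonus = if stats.access_count > 1_000 { 1 } else { 0 };

    (base.round() as usize + bonus).min(MAX_REPLICAS)
}

fn main() {
    let hot = ChunkStats { access_count: 5_000 };
    let cold = ChunkStats { access_count: 2 };
    println!("hot chunk, roomy network:  {}", replica_target(&hot, 0.9));
    println!("cold chunk, tight network: {}", replica_target(&cold, 0.1));
}
```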
Ultimately though, a network which isn’t being used, will eventually start to decay. Whether data is temporary or permanent, if folks aren’t uploading, then node operators won’t be profiting.
This also provides a strong argument for a bedrock of spare capacity meeting the storage needs. These devices are unlikely to respond quickly to market forces, as the cost overhead is very low in the first place and any income may be better than none.
Nope. I don’t know how you are doing the math there, but none of that is the hard part. I will explain how it can all be done another time though as I’m busy the rest of the night.
Nope. The cascade collapse can happen regardless as it’s based on demand, not the inherent costs of storage.
We mustn’t forget that the network isn’t and won’t be the only game in town. There are now, and will be, many more alternatives for data storage. We need to leverage the tech, but also be flexible to market demand.
That’s why I expect permanent storage will be feasible and sustainable.
But, that doesn’t prevent a possible ‘death spiral’ scenario caused by a drop in nodes pushing store cost up to a level where demand is lower, reducing rewards, causing more nodes to exit etc etc.
Adding mitigations for this at a later stage, perhaps along the lines of the ideas @Traktion shared, seems pragmatic, even if they’re never activated in practice.
Where should that level be? As Dimitar showed, people are uploading pics for 5 USD via blockchain… Such a sum would keep a cloud node running for close to a year, I’d think… and nodes at home are cheaper… So because of the different cost levels there will always be nodes running at a profit.
If this happens there is something terribly wrong with the network (economics) - and it’s not the possible 5x for more/fewer replicas… and not the 6x for permanent storage… If we reach a price level with no upload demand, then the network is useless and/or way too expensive for farmers to run…
Well, chunks aren’t associated with a user. Once they’re on the network, they’re just data to be maintained.
Even private data is the same. It just happens to be encrypted so that it only makes sense to the user with the private keys who uploaded it.
If you want billing cycles to allow these chunks to be sacrificed if a particular user doesn’t pay, you’re going to have to link them back to a user somehow. You then start losing anonymity and adding complexity.
But sure, please link your prior ideas or state them here when you have time. Then we can see how simple it is.
Another idea would be to mark some chunks with a temporary flag, which could be a cheaper one off fee. Unless someone also uploads a permanent chunk with the same content, these temporary chunks could be the first to be sacrificed.
However, even this does lead to a network with 404s and potentially missing data. If it is simply about network survival, it could be an option though.
This was suggested ages ago, and I’m sure forgotten in time, but yes, you could have the flag salt the encryption as well, so there would not be a dedup situation. If someone wants a permanent copy of that chunk, then it’s paid for and uploaded normally.
The metadata on the flagged chunk would carry the flag with a date field, so any node holding the chunk could feel free to remove it at its leisure any time after that, and replication would not replace the chunk on loss of it after that date/time.
But that would require metadata to be held for each chunk. More complexity, as you say.
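To make that concrete, here’s a rough sketch of what a universal record header with an optional expiry might look like - the names and layout are mine, not the real record format. Because the salt feeds into the encryption, a temporary chunk ends up at a different address than a permanent chunk of the same content, so the two never dedupe into one record:

```rust
/// Seconds since the Unix epoch; a plain u64 keeps the per-record overhead tiny.
type UnixTime = u64;

/// Hypothetical lifetime field carried by every record.
enum Lifetime {
    /// Normal chunk: kept and replicated indefinitely.
    Permanent,
    /// Flagged chunk: the salt breaks dedup with any permanent copy, and after
    /// `expires_at` any holder may drop it and replication stops replacing it.
    Temporary { expires_at: UnixTime, salt: [u8; 16] },
}

struct RecordHeader {
    lifetime: Lifetime,
}

/// A node sweeping its store: permanent chunks are always kept,
/// temporary ones only until their expiry has passed.
fn should_keep(header: &RecordHeader, now: UnixTime) -> bool {
    match &header.lifetime {
        Lifetime::Permanent => true,
        Lifetime::Temporary { expires_at, .. } => now < *expires_at,
    }
}

fn main() {
    let temp = RecordHeader {
        lifetime: Lifetime::Temporary { expires_at: 1_700_000_000, salt: [0u8; 16] },
    };
    let perm = RecordHeader { lifetime: Lifetime::Permanent };
    // Past the expiry, the temp chunk may be dropped; the permanent one never is.
    assert!(!should_keep(&temp, 1_800_000_000));
    assert!(should_keep(&perm, 1_800_000_000));
}
```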
Interesting re the salt. I suppose you lose dedup in order to isolate the chunk, but it could mean the uploading user could just ‘bump’ the temporary chunk periodically with a payment. The user wouldn’t need to be identified either then.
As you say, it would need metadata to track expiry/renewal for these temporary chunks though.
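A tiny sketch of the ‘bump’ idea (again, hypothetical names, not an existing API): renewal just references the chunk and pushes its expiry forward, so no link back to the original uploader is ever needed:

```rust
type UnixTime = u64;

/// The expiry metadata a holder would keep for a temporary chunk.
struct TempChunkMeta {
    expires_at: UnixTime,
}

/// A 'bump': whoever pays (uploader or anyone else) just references the chunk
/// address, and the holder pushes the expiry forward by the purchased period.
fn bump(meta: &mut TempChunkMeta, now: UnixTime, paid_period: u64) {
    // Extend from whichever is later, the current expiry or now, so a lapsed
    // chunk (if still held) restarts its clock from the present.
    let base = meta.expires_at.max(now);
    meta.expires_at = base + paid_period;
}

fn main() {
    let mut meta = TempChunkMeta { expires_at: 1_700_000_000 };
    bump(&mut meta, 1_699_000_000, 30 * 24 * 60 * 60); // roughly 30 more days
    assert_eq!(meta.expires_at, 1_700_000_000 + 30 * 24 * 60 * 60);
}
```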
I could also imagine that the cost could end up higher for the user. We would also lose the immutability in general and all the advantages that brings.
Only for temp data. One would expect temp data to be mostly unique anyhow, though that’s not guaranteed, so the loss of dedup for the temp chunks would seem to be minor. The metadata required would also mean a whole new storage element for the network - not a new data type, but an addition to the definition of a record. Every record needs it, since it has to be universal.