Sad to say, but that is the essence of the cascade collapse scenario I've voiced concerns over many times in recent months here on the forum. So long as we have decent growth as well as continually falling costs, it all works fine … but the market isn't consistent, it has ups and downs … so data storage isn't consistent either, it too has ups and downs … so we have to be resilient against periods of low growth. I'm far from convinced that we can be that resilient if we are relying only upon permanent data.
I suspect a blockchain is the wrong data structure for storing large amounts of data. That alone could make it very expensive. If there is no deduplication, even more so.
As Autonomi uses chunks and deduplication, along with a fast non-blockchain based p2p network to access them, it’s potentially a much more efficient beast.
Yet here you are, producing yet more data. Me too!
Data just keeps on coming, year after year. It tends to get bigger and bigger too, as more and more stuff gets analysed and stored.
Moreover, storage continues to decrease in price, decade after decade, with no sign of this halting.
Autonomi reflects that reality in its economics.
100% agree with your statement. The fact that we feel the need for network emissions is a warning sign for me. The argument is that they are needed in the "unstable and shaky" times of network start-up and reduce over time. However, an extended period of low or no influx of new data, even in a very mature network, will lead to a cascading collapse if node operators act rationally (irony: maybe there is hope that they don't).
I understand that argument, but it seems you assume we get a 100% share of that market. We are one (!) actor in the storage market. I have yet to see a business that grows constantly, year after year, without fail.
I guess time will tell and like I said earlier, I really hope to be proven wrong with my concerns!
I think that is what it boils down to.
Is it the best design for a managed decline in new data uploads? No. There will be an intersect (Z): the rate at which new data (X) needs to be added to sustain the amount of data (Y) already persisted.
However, as long as there is a steady rate of new data of at least Z, then we are good.
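For what it's worth, here's a toy back-of-envelope sketch of that intersect (all the numbers and the simple revenue-vs-cost framing are my own assumptions, not network parameters): Z is just the upload rate whose one-time payments cover the monthly cost of holding the data already stored.

```python
# Toy sketch of the intersect Z: the rate of new uploads whose one-time
# payments cover the ongoing cost of holding the data already persisted.
# Every number here is an illustrative assumption, not a network parameter.

def break_even_upload_rate(stored_tb: float,
                           holding_cost_per_tb_month: float,
                           upload_price_per_tb: float) -> float:
    """Z in TB/month: new uploads needed so upload revenue covers the
    monthly cost of keeping `stored_tb` of existing data alive."""
    monthly_holding_cost = stored_tb * holding_cost_per_tb_month
    return monthly_holding_cost / upload_price_per_tb

y_stored = 1_000.0           # Y: TB already persisted (assumption)
cost_per_tb_month = 2.0      # node-side cost to hold 1 TB for a month (assumption)
price_per_tb_upload = 100.0  # one-time payment per TB uploaded (assumption)

z = break_even_upload_rate(y_stored, cost_per_tb_month, price_per_tb_upload)
print(f"Z = {z:.0f} TB/month to sustain {y_stored:.0f} TB stored")
# With these made-up numbers, Z is 20 TB/month; falling storage costs shrink
# the numerator over time, which is the core argument for viability.
```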
If the network enters terminal decline, there is an argument that it will have been superseded anyway. Data will likely be migrating to whatever that new system is. That’s ok too.
What I would like to see from the team is temporary data added as a feature on the road map. That would give confidence to many potential hodlers that the project can take on a more stable and less risky nature.
Further, I don't think this is a big ask either, and if it is, I'd like @dirvine to explain how it is. I'm not asking to dump permanent data, only that real temp data be added to the plan going forward.
Such would dramatically increase the potential of the network.
If it works out that permanent data is cheaper due to deduplication, would you still hold this position?
In my view, there's no clear argument for why they are necessary or would be effective. The reasoning given has been vague notions that they might help if income from incoming uploads is too low to support nodes. I don't think emissions are necessary, assuming growth in line with organic demand for network resources, which I feel is the best way to do things… but at least emissions are now pretty small vs what was originally planned, so any negative distortion from them should be minimal, and if they do end up helping, then great.
I guess if operators expect the situation to change and demand to tick up again, they will carry on through the tough times, earning more tokens per GB stored due to the low token price at that time; so rationally behaving operators who are optimistic about the future of the network would do well to keep the network healthy by adding capacity if needed.
There are certainly a lot of complex dynamics that can’t easily be predicted.
I definitely agree that the whole system will be more resilient if the network is offering more than just permanent data, so I hope that happens long term. The more revenue sources to nodes the better, and the more choice to network customers the better.
Yes, absolutely. Dedup will affect costs, and so the price, in a very beneficial way - particularly for some types of data - but it won't change the decisions of data managers who are going to stick with pay-as-you-go systems, as those are the known quantity.
Markets are conservative, we need to offer a product that they demand.
I have high hopes for dedup - it's a key tech for Autonomi - but for much of the unique future data it's not a game changer, especially with our large chunk size now. It'll be great for movies and films - things people collectively watch and store - but it isn't helpful at all for things like astronomical data from telescopes, which needs huge amounts of storage and is all unique. Companies too will likely be generating mostly unique data.
I see no reason that we need to be exclusive to permanent data. There are some technical reasons why it will take more time to develop temp data on Autonomi, but all we need to do is put it on the road map and work on it when there is time and funds - it's not impossible. Putting it on the road map would help to stabilize the market for the token, as it gives hope that a solution will be developed, and it would open the gates to a much wider marketplace for data.
It feels like it depends on how high the annual cost is vs the perpetual cost. Do we know how many years it would take before the cumulative annual cost grows larger than the initial perpetual cost?
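A rough way to frame that question, with placeholder prices only (not real quotes from anyone), is a simple cumulative-rent calculation; letting the rent fall over time models storage getting cheaper:

```python
# Rough break-even between paying annual rent and a one-time perpetual fee.
# Prices are placeholders, not quotes from any provider.

def break_even_years(perpetual_price: float,
                     annual_rent: float,
                     rent_decline_per_year: float = 0.0) -> int:
    """Years until cumulative rent exceeds the one-time perpetual price.
    Returns 1000 if it never does (rent falls too fast to ever catch up)."""
    total, years, rent = 0.0, 0, annual_rent
    while total < perpetual_price and years < 1000:
        total += rent
        years += 1
        rent *= (1.0 - rent_decline_per_year)
    return years

# Placeholder example: $16/GB once vs $2/GB/year rent falling 10% a year.
print(break_even_years(perpetual_price=16.0, annual_rent=2.0,
                       rent_decline_per_year=0.10))   # -> 16 years
```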
Ofc, data managers aren't going to choose crazy new tech, but a fresh angle may help. If it's just monthly fees vs Google/Amazon, that's a tough argument to make to the purse holders.
I agree the large chunk size doesn’t help. I’m assuming when we go to the native token, this can be reduced though. We already use smaller chunks for smaller files, so the network itself must be able to handle it. It’s just a question of what is allowed to be persisted.
We are about to get scratchpad data types, which only store a current value and abandon history. That should cater for a lot of the shorter term data requirements. I’m assuming the cost model for these will be cheaper too.
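Conceptually (and this is only an illustrative sketch, not the actual Autonomi API), a scratchpad is a mutable slot that keeps the latest value plus a counter and nothing else:

```python
# Illustrative sketch of a "current value only" data type, in the spirit of
# the scratchpad described above. This is NOT the real Autonomi API, just a
# way to picture a mutable slot that keeps no history.

from dataclasses import dataclass

@dataclass
class Scratchpad:
    owner: str          # stand-in for the owner's public key
    counter: int = 0    # monotonically increasing version number
    value: bytes = b""  # only the latest value is ever held

    def update(self, new_value: bytes) -> None:
        """Overwrite the stored value; the previous contents are gone."""
        self.counter += 1
        self.value = new_value

pad = Scratchpad(owner="pubkey-placeholder")
pad.update(b"app state v1")
pad.update(b"app state v2")    # v1 is discarded, not archived
print(pad.counter, pad.value)  # -> 2 b'app state v2'
```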
Then there is data that is ephemeral and really only for storage by the peers consuming it. For this sort of data, maybe even a scratchpad is too heavy and it doesn’t really need to be stored on the network at all.
So, I suspect we have a category of data - from 1 month to 24 months - where perpetual storage could be too expensive and the alternatives are too short-lived.
I do wonder how much would fall into that category. The telescope data is likely forever data, and paying rent on it is likely to be costly long term. Same for a lot of data gathered for analytics - long-term, perpetual storage would be ideal.
Maybe we would miss out on a chunk of the market that wants something in between, but I think a lot of that may come down to cost. Competing directly with AWS, on similar/worse terms, is likely a road to a hiding… at least initially.
Would there be two prices for storage? Would nodes get to choose between each? Could one end up subsidising the other? Would having both prevent the spiral collapse you have mentioned or would one drag the other down anyway?
Would apps have to support both storage options and expect dead chunks? Would that mean potentially losing conversations in forums or Twitter clones because the author didn't want to pay for their storage any longer? Would we need the Wayback Machine all over again?
If it adds a useful feature to the network, I wouldn’t be against it. However, it needs careful consideration on the user and dev impact. It would also add complexity to the network and its apps. It reverses a core tenet of the network - that data will be there for as long as the network - and shouldn’t be entered into lightly, imo.
I don't think that's bulk storage - it's app data storage? I'm also not sure that it's temp space - more like rewritable space? So you'd still have to pay as though it were permanent space? I obviously don't know the deets here, just my gut. Perhaps you know more about this and can explain?
I'm not certain about that either - they are already using AI to analyze it, and while it may be nice to keep it forever and do better analysis in the future, the cost to store it could be enormous - especially as permanent data. We are not talking small data here (chuckles to himself) … telescope data is going to be big data into the future. If it can be held for a few years to be analyzed by various AI models and then discarded, that may be the better deal.
I don’t think we could fully compete there in terms of latency, but for big data we are about equal and we can emphasize privacy, security and censorship resistance.
100% … temp data would have to charge a different rate.
I presume the client would submit a temp-data store request (perhaps with an expiry date) and the node would send a quote for that. The client then pays and the upload begins. Many deets left out, of course.
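Something like this hypothetical flow, perhaps (every name, field and rate below is invented for illustration; it just mirrors the request, quote, pay, upload shape described above):

```python
# Hypothetical temp-data store flow: client requests a quote with an expiry,
# node quotes based on size and duration, client pays and uploads.
# All names, fields and rates here are invented for illustration.

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class TempStoreRequest:
    size_bytes: int
    expiry: datetime          # when the node may drop the data

@dataclass
class TempStoreQuote:
    price_nanos: int          # price in the smallest token unit
    expiry: datetime

def quote_temp_store(req: TempStoreRequest,
                     rate_per_gb_day_nanos: int = 1_000) -> TempStoreQuote:
    """Node side: price scales with size and how long it must be held."""
    now = datetime.now(timezone.utc)
    days = max((req.expiry - now).days, 1)
    gb = req.size_bytes / 1e9
    return TempStoreQuote(price_nanos=int(gb * days * rate_per_gb_day_nanos),
                          expiry=req.expiry)

# Client side: build the request, get the quote; payment and the actual
# chunk upload are elided (the "many deets" left out above).
req = TempStoreRequest(size_bytes=4 * 1024**2,
                       expiry=datetime.now(timezone.utc) + timedelta(days=90))
print(quote_temp_store(req))
```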
perhaps, but that would be bad … I hope we could shape incentives to avoid such.
Not prevent, but lessen the risk. Having a temp data market means more income. It also means that during bad times, when there is little data coming into the network and nodes drop off, the network can hold on for longer, as old data is expiring and so fewer nodes are needed. The issue is that with fully permanent data, when enough nodes drop off during bad times, the data may be lost. Temp data gives some breathing room. Think of a fat person and a skinny person during a famine … one survives, the other doesn't.
Apps would choose what they want to use, so it could be either situation.
I think it simply adds to it. It’s not like permanent data goes away. Adding temp data gives flexibility to developers, clients, and nodes. It’s a win all the way around as far as I can see.
How many of these projects are simply decentralized cloud storage, and which of them can also serve websites like Autonomi does?
The cost is so high because Arweave is out of space. To earn with Arweave, you currently have to supply the network with 45 x 4 TB drives (plus SAS controllers, enclosures, etc.).
Every Arweave node stores all the data. It's useless.
I'm not sure about that … I asked Gemini, which gives dubious responses about such things, but it indicated that Arweave uses a system of variable redundancy - the larger the network, the greater the redundancy (I suppose there are limits on that). I think that might be a mechanism to deal with the very issue of cascade collapse in a permanent-data network, since redundancy will decrease automatically if nodes start disappearing - assuming this notion is true at all; again, I don't trust the answers Gemini gave me.
Hmm… and I guess that requirement would also keep going up in line with demand? Doesn’t sound quite as advanced as Autonomi’s automatic but limited replication of data.
It certainly seems like Arweave is the closest competition for now, and that Autonomi has a good chance of out-performing once the network is up and running with the API ready to go… and even more so when the native token is ready to be used.
According to this website: Quick Guide to Permanent Storage on Arweave | Community Labs Blog
Arweave nodes are incentivised to store as much of the Blockweave (structure that holds Arweave’s stored data) as possible to maximise chances of mining. That sounds like something that won’t scale well without some serious management of what is replicated how many times.
It’s cool though that it seems easy to make a website on their permaweb - even if it’s $16 per GB, that’s not a big issue for a small static website (saying that, I haven’t actually tried it, but it looks easy from their website).
There must be reasons why Arweave, IPFS and IPC just popped up out of the ground around a crypto boom, slapped crypto onto their projects and went YOLO.
This also raises an issue: if I upload some PDF of tech specs privately so I can access it anywhere, but others have also uploaded it because they work with the same microcontrollers, and then I delete my copy, what happens to theirs? Poof, gone.
To have dedup we cannot have broad deletions. With private files encrypted with your account blob key it's possible, but we don't even have account blobs yet. And even then there is still a possibility of dedup happening with temp files encrypted with your account blob key. For instance, you might have thousands of private files and many duplicates in different directories, so you say to yourself, "Let's clear out a couple of directories, since the files are stored in other directories." Bam - you just zapped ALL copies of that private file.
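To make that hazard concrete, here's a toy content-addressing sketch (the real network derives chunk addresses via self-encryption, but the effect is the same: identical content maps to one address):

```python
# Toy illustration of why deletion is dangerous with dedup: identical content
# maps to a single address, so "my copy" and "your copy" of the same file are
# literally the same chunk. (Real chunks come from self-encryption; a plain
# hash here is just to show the effect.)

import hashlib

store: dict[str, bytes] = {}   # address -> chunk data

def put(data: bytes) -> str:
    addr = hashlib.sha256(data).hexdigest()
    store[addr] = data         # a second upload dedups to the same address
    return addr

pdf = b"microcontroller datasheet contents"
my_addr = put(pdf)             # I upload the spec sheet
their_addr = put(pdf)          # someone else uploads the same file
assert my_addr == their_addr   # one chunk, multiple uploaders

# A broad "delete my copy" that ignores other uploaders zaps everyone's copy:
del store[my_addr]
print(their_addr in store)     # -> False: theirs is gone too
```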
Deletion is such a dangerous thing in a global system with dedup.
So do we remove dedup and assign a unique address to each chunk? What about the huge savings dedup brings? How many people remember to delete files? Why delete when you have already paid?
tl;dr: only a few will ever delete their files to save space on the network; most people, like 99.9% of people, only delete when their disk is filling up. And since they have already paid for the files, there is no incentive.
And for temp files (like an editor program's temp files), well, your disk is still there for that.
Yep and also to cover the cost of running their nodes.
No, this works with 1% or 0.0001% of the market. People generate more data daily; it works with a small percentage just as it would with 100%. As long as Autonomi keeps doing what it's meant to do, there is no reason why people using it regularly won't keep using it. If the number of people keeps dropping, then the network is dying or dead anyhow, with or without dedup (dedup also requires persistent data).
We need to get it out there and see what happens. The case that persistent data is viable on cost has been shown by 60+ years of history and by emerging technology, as opposed to gut feelings that it won't be. Dedup isn't needed for it to work, but it will be a huge benefit.
If cat videos/memes are anything to go by then it will be a huge amount of saved storage space.
Yep, it's the cost of supplying nodes that jacks the price up.
With Autonomi, using spare resources means it costs almost nothing to supply storage. This alone defeats most arguments against persistent data, let alone all the evidence that it is still viable even when paying market price for new drives. And when people upgrade their computers they get bigger drives anyhow, so there is no extra cost to have more space for running nodes after a normal upgrade.
A couple of comparatives
Amazon is 50x the cost of the media per GB stored on their IOone fast storage service, which is really cached in memory and stored later - really a short-term service - where the cost is split-priced, with 80% of the cost for IOPS speed and the other 20% for storage capacity.
PUREaaS is US $40.00 per GB stored, and that is monthly, on not-so-fast PURE Storage NVMe hardware.
imo Autonomi should be offering a competitive step-change 2x price decrease compared to PURE, at a minimum. Given that upload and download speeds are what they are today, it is really competing with HDD and tape for 'archive-like' business, and those services are really cheap monthly per GB (check out BackBlaze, the king of cheap backup services in Silicon Valley, for a price point).
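Just to put those figures side by side per TB per month (the archive-tier number is a rough placeholder for a cheap backup service, not a verified price):

```python
# Putting the quoted figures side by side, per TB per month.
# The archive-tier figure is a rough placeholder, not a verified price.

pure_per_gb_month = 40.0                      # PUREaaS figure quoted above
step_change_target = pure_per_gb_month / 2    # the suggested 2x price decrease
archive_per_tb_month = 6.0                    # placeholder for a cheap backup tier

print(f"PURE:           ${pure_per_gb_month * 1024:>9,.0f} / TB / month")
print(f"2x cheaper:     ${step_change_target * 1024:>9,.0f} / TB / month")
print(f"Archive backup: ${archive_per_tb_month:>9,.0f} / TB / month (placeholder)")
# Even at half the PURE price, flash-tier rates sit orders of magnitude above
# archive pricing, which is why positioning matters as much as the price itself.
```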
As for 'term-limited' storage, one is up against Amazon, and their speed is undeniably fast, so Autonomi would have to price that type of offer accordingly - much less - and position it primarily as archive storage rather than speed, given the current upload and download speeds observed, and be competitive with the monthly per-GB price points of tape and HDD 'backup services' from the market leaders in those spaces (to grab bigger business relationships and deals).
For sure, the truly distributed nature of Autonomi's pay-once, store-forever offer, combined with quantum-resistant encryption, is a seller, versus just speed compared to existing backup services.
The question is what price Autonomi can command for that competitive difference, in both the current case and the proposed case. (The first is unclear at the moment; the latter needs some more thought, imo.)