Sacrificial data vs non-permanent data

tfa · August 9, 2015, 8:19pm

For the safe network to be used successfully, two main incentives must be properly implemented:

Incentive for users to put data on the network => cost of put operations must be minimized.
Incentive for farmers to provide resources to the network => reward for responding to get requests must be maximized.

These incentives are contradictory. Not directly, because a put operation is not the same as a get operation, but indirectly, because put operations add data to the network, which allows get operations to be requested on that data. So the system is more complex than a simple supply and demand problem on a common resource.

In the short term farmers will be given new safecoins without impacting the cost of put operations because there are almost 4 billion of them in reserve (there may be an associated inflation problem with them, but let’s ignore it). In the long term there will be fewer of them and the majority of safecoins given to farmers will be recycled ones. An equilibrium between costs of put and get operations will have to be reached. The main problem is the data duplication (each chunk is stored several times):

In the past I have read that each chunk was stored 20 times (4 online + 16 offline)
I have also read there is a 20% reserve of storage space free
More recently three kinds of chunks appeared (normal, sacrificial and backup)
In any case there is an important percentage of data that is never read again (that will not generate get requests)

I don’t know the multiplier value but I suppose it is an order of magnitude (i.e. 10 MB of storage is needed in the safe network to store a 1 MB file). With such a multiplier it seems hard to satisfy both the users and the farmers, and I suspect it is the origin of the pessimistic formula given in RFC n° 5:

int cost_per_Mb = 1/(fr^(1/5)); (5th root, integer value)

For example, if FR is 100 000 then 100000^(1/5) = 10, so one MB costs 1/10 safecoin and one TB costs 1000*1000/10 = 100000 safecoins ≈ 2000€ at the current price. This is very expensive (for comparison, Dropbox is about 100€ per year for one TB), and yet the chosen value for FR is very high (one reward request every 100 000 gets !!!).
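As a rough check on this arithmetic, here is a minimal Rust sketch of the quoted formula. The function names are hypothetical, the result is kept as a floating-point value (the "integer value" remark in the formula is ignored, which is an assumption), and 1 TB is taken as 1 000 000 MB as in the example.

```rust
/// Cost of storing one MB, in safecoins, under the quoted test formula
/// cost_per_mb = 1 / fr^(1/5).
/// Assumption: the value is kept as a float, with no integer truncation.
pub fn cost_per_mb(farming_rate: f64) -> f64 {
    1.0 / farming_rate.powf(1.0 / 5.0)
}

/// Cost of storing one TB, taking 1 TB = 1 000 000 MB as in the post.
pub fn cost_per_tb(farming_rate: f64) -> f64 {
    const MB_PER_TB: f64 = 1_000_000.0;
    MB_PER_TB * cost_per_mb(farming_rate)
}
```

Under these assumptions `cost_per_tb(100_000.0)` comes out at about 100 000 safecoins, and `cost_per_tb(1_000.0)` at roughly 251 000, so lowering FR makes storage more expensive.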

These values are a blocking problem, because no useful equilibrium can be reached and both users and farmers will leave the network:

Professional farmers will have negative ROI
Casual users won’t store data because it’s too expensive and besides they won’t be able to farm a single safecoin

I think that non-permanent chunks can solve this problem. What I propose is not a fixed time limit but a security measure: deletion of the oldest chunks will be triggered only if the safe network runs out of disk space. Each data chunk has an associated last payment date: for private data the user has to pay to update this metadata and for public data anybody can pay.

This solution comes in addition to sacrificial data. When the network runs out of space, first the sacrificial chunks are deleted and then the oldest chunks. The network is currently supposed to work with permanent data that does not need to be deleted. This proposal is only a security measure that would be useful only if this assumption does not hold. In this hypothetical case it is better to delete old chunks than random chunks (because that is what will happen if people stop their vaults when there is not enough space to store the chunks they hold).
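The deletion order described above can be sketched as follows. This is only an illustration under stated assumptions: the `Chunk` type, its `last_payment` field and `select_for_deletion` are hypothetical names, not part of any actual vault code, and how the network would agree on a payment date is left open.

```rust
/// Hypothetical chunk categories; only the two relevant to this proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChunkKind {
    Normal,
    Sacrificial,
}

/// Hypothetical chunk metadata used only for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    pub id: u64,           // illustrative identifier
    pub kind: ChunkKind,
    pub size: u64,         // bytes
    pub last_payment: u64, // assumed logical clock; larger means more recent
}

/// Returns the ids of the chunks to delete, in order, so that at least
/// `needed` bytes are freed: sacrificial chunks first, then the chunks
/// with the oldest last payment.
pub fn select_for_deletion(chunks: &[Chunk], needed: u64) -> Vec<u64> {
    let mut candidates: Vec<&Chunk> = chunks.iter().collect();
    // `false` sorts before `true`, so sacrificial chunks come first;
    // within each group the oldest payment comes first.
    candidates.sort_by_key(|c| (c.kind != ChunkKind::Sacrificial, c.last_payment));

    let mut freed = 0u64;
    let mut victims = Vec::new();
    for c in candidates {
        if freed >= needed {
            break;
        }
        freed += c.size;
        victims.push(c.id);
    }
    victims
}
```

Nothing is selected while there is no shortage (`needed == 0`), which matches the idea that deletion is triggered only when the network runs out of space.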

The advantage of a payment to keep old data is that safecoins are recycled when users pay to update the last payment date. This is where the blocking problem is really solved: this influx of recycled safecoins will increase the probability of getting one at each reward request, and so new revenue will be added for farmers for maintaining the safe network.
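To make the effect of recycling concrete, here is a small Rust sketch. It rests on an assumption that is not stated in this thread: that the chance of a reward request yielding a safecoin is simply the share of safecoins that are currently not issued. The function names are hypothetical.

```rust
/// Assumed model: a reward request succeeds with probability equal to
/// the fraction of safecoins that are not currently issued.
pub fn reward_success_probability(issued: u64, total_supply: u64) -> f64 {
    1.0 - issued as f64 / total_supply as f64
}

/// Probability after `recycled` coins have been paid back to the network
/// (for example to renew a last payment date) and are available again.
pub fn probability_after_recycling(issued: u64, recycled: u64, total_supply: u64) -> f64 {
    reward_success_probability(issued.saturating_sub(recycled), total_supply)
}
```

In this simplified model every recycled coin raises the success probability by one part in `total_supply`, so the more users pay to keep data, the more often farmers are rewarded.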

I don’t know how the formula giving the cost per MB was computed and so I don’t know what it will become under this new condition but I suppose that it will be more user and farmer friendly.

By the way, it would be useful if MaidSafe open-sourced their simulations so that we can understand how the formula is computed.

Edit: Previously deletion of old data replaced deletion of sacrificial data, in the new version it comes after.

bcndanos · August 9, 2015, 8:57pm

I completely agree with you. There’s a problem with current incentive/reward system. It’s not sustainable.
I proposed an RFC with similar thoughts a month ago, but it wasn’t accepted. My idea was a system similar to the ‘gas’ in the Ethereum system, where farmers/miners set their minimum price to process/store the data, and on the other side the client sets how much they want to pay for the same.

digipl · August 9, 2015, 9:41pm

That’s not possible, because it breaks the fundamental security of the SAFE network: chunks of data are not linked to an owner. In your system the chunks of data must have an owner, so you link the information to its owner.

Your solution is not the SAFE network; it is another network, and not a safe one.

Seneca · August 9, 2015, 9:45pm

4-6 online now, none offline. Vaults are not persistent anymore.

Here also, 4-6 MB.

As far as I understand it, the formula is just one used for testing in one of the test nets. A dynamic algorithm rather than a fixed formula is required eventually. This formula is btw based on a guesstimated GET per PUT ratio, not on the amount of redundancy in the network.

You’re reading the formula incorrectly. ^(1/5) means to the power of 0.2. Also, on what do you base the assumption that an FR of 100 000 is realistic?

tfa · August 9, 2015, 10:00pm

There is always a link because otherwise an owner wouldn’t be able to retrieve the chunks of a file. But the link is unidirectional: we can go from the owner to the chunks but not the reverse. Otherwise I agree the network couldn’t be called safe anymore.

This unidirectional link is enough to update a last payment field. I think this is already the way that Structured Data can be updated by their owners while remaining safe.

bcndanos · August 9, 2015, 10:02pm

Ummm. It’s not true. There are owners in the ‘Structured Data’.

struct StructuredData {
…
owner_keys : mut Vec<crypto::sign::PublicKey> // n * 32 Bytes (where n is number of owners)
version : mut u64, // incrementing (deterministic) version number
previous_owner_keys : mut Vec<crypto::sign::PublicKey> // n * 32 Bytes (where n is number of owners) only required when owners change
signature : mut Vec // signs the mut fields above // 32 bytes (using e25519 sig)
}

Yes, I noticed that. That’s why I deleted the RFC myself.

Ricmaric · August 9, 2015, 10:05pm

The numbers are correct
100000^0.2=10
1TB=1mil MB

And when you lower the farming rate you get even crazier numbers

FR at 1000 gives you 250000SC/TB

digipl · August 9, 2015, 10:07pm

The chunks are not Structured Data. And they are probably the vast majority of the data on the SAFE network.

tfa · August 9, 2015, 10:14pm

I am not talking about persistent vaults. I am talking about permanent data. I propose that the older data may be automatically deleted under a stress condition on remaining disk space.

100000^(1/5) really equals 10

I took 100000 because the result is simple. For this value the put cost is too high and yet the reward is too low. A different value will generate either a higher cost or a lower reward.

Seneca · August 9, 2015, 10:15pm

Apologies, I didn’t bother to fill it in.

Okay, the formula seems badly chosen. I still don’t think the OP’s solution is required. This is what I posted under that RFC a few months ago:

Another way to look at this is considering the SAFE network as a
decentralized autonomous (non-profit) organization. The SAFE network has
income (from PUTting clients), expenditures (rewards to farmers), and
capital (non-issued SafeCoins).

At some point in the future no more new SafeCoins can be issued, but
SafeCoins will still be recycled. This means for the SAFE network to
maintain equilibrium (not “bankrupting”) in this late stage, income
(over any given period) needs to at least match expenditures. So in
general, the network should ask just enough SafeCoin from PUTting
clients to be able to accommodate vaults. This way the absolute lowest
PUT price will be found, maximizing accessibility for all.

One way to do this is having a target amount of SafeCoins in
existence. At the late stage of the network, this could perhaps be 99%.
If over 99% of SafeCoins are in existence, PUT prices would be raised,
if less than 99%, PUT prices would be lowered. The remaining 1% would
function as a buffer to protect the network against sudden fluctuations
in demand (amount of PUTS).

If such an algorithm would be used, it would seem logical to me to
also have a target amount of SafeCoins in existence before this late
stage, right from the start. The target could be derived from any
variable or combination of variables the network can autonomously
measure or approximate (for example, current total network capacity).

As an added bonus, if derived wisely, this usage of a target amount
of SafeCoin could also protect against potential malign high inflation
rates in the early days of the network, when a lot of new SafeCoins
would be issued. I must admit I’m not aware how predictable and balanced
the current algorithm of issuance of new SafeCoins is, so maybe malign
high inflation is already impossible.

So basically, what I propose is an algorithm that sets a target percentage of SafeCoin in circulation at any given time, and the PUT price is adapted to aim for that target. The GET rewards already balance dynamically. This way we also get a dynamic PUT price to match it, achieving income/expenditure balance of the network.
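A minimal Rust sketch of such a price adjustment could look like this. It is an illustration only: the function name, the parameters and the fixed multiplicative step are assumptions, and the post above does not say how large each adjustment should be or how the target would be derived.

```rust
/// Moves the PUT price one step toward the target share of issued safecoins.
/// `target_fraction` is the desired share in existence (e.g. 0.99 in the
/// late stage), `step` is an assumed relative adjustment per round (e.g. 0.05).
pub fn adjust_put_price(
    current_price: f64,
    issued: u64,
    total_supply: u64,
    target_fraction: f64,
    step: f64,
) -> f64 {
    let in_existence = issued as f64 / total_supply as f64;
    if in_existence > target_fraction {
        // Too many coins in existence: raise the PUT price so that
        // more coins are recycled back to the network.
        current_price * (1.0 + step)
    } else if in_existence < target_fraction {
        // Buffer larger than needed: lower the PUT price.
        current_price * (1.0 - step)
    } else {
        current_price
    }
}
```

Calling this repeatedly, for example once per measurement period, would make the price drift up while more than the target share of coins is in existence and drift down otherwise.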

tfa · August 9, 2015, 10:30pm

OK, StructuredData has an owner_keys field, but I think the owners remain anonymous. Besides, the chunks don’t need such a field: anybody can send a payment, and there is no need to control who it is.

The network remains safe with this proposal; your RFC deletion was unfortunate.

digipl · August 9, 2015, 10:42pm

When?
If you don’t control it, how do you find out that you have to pay?
The chunks have no date information; which do you delete and which not?
If a chunk is deleted it can affect thousands. How do you control that?
Etc, etc, etc…

Sorry, but with your idea the SAFE network becomes an uncontrollable nightmare.

Ricmaric · August 9, 2015, 10:59pm

No, but seriously: when will we know how much space a safecoin buys, and how and by whom will that be decided? Or is this it, 1 SC/MB?

tfa · August 9, 2015, 11:03pm

When:
It’s a UI problem to present a dashboard showing the files that might be deleted if the network is about to run out of disk space. User configured programs can also automatically do the payment when necessary.

Who:
For private data the owner will pay. For shared data anybody can pay.

Nightmare:
People are used to paying a recurring fee with traditional cloud storage services (Dropbox, Google Drive, OneDrive, …). If presented correctly in the UI this isn’t a nightmare.

Besides the network is currently supposed to work with permanent data that do not need to be deleted. This proposal is only a security measure that would be useful only if this assumption is not true. In this hypothetical case it is better to delete old chunks that people don’t care about instead of random chunks (because this is what will happen if people stop their vault when there is not enough space to store the chunks they held).

Seneca · August 9, 2015, 11:20pm

It is definitely not 1SC/MB. We may get an indication in the test nets, but we’ll only be sure when SafeCoin launches for real. This is because it is calculated dynamically in an attempt to keep the network healthy and growing under varying conditions.

digipl · August 9, 2015, 11:50pm

Files? What files? In the SAFE network you don’t have files. You have millions, billions or trillions of encrypted chunks. You expect all users to be connected all the time to check in real time whether a chunk is going to be deleted. To do that you need to ask continually about the status of thousands or millions of chunks, and all users must do the same at the same time.
And if you automate it, you break the basic security of the SAFE network by linking the information to its owner.

Not a nightmare, something worse.

P.S. I’m beginning to think that you don’t understand that SAFE is a completely distributed network, and the rules of a client-server solution don’t work here.

wes · August 10, 2015, 12:29am

I think what he’s saying is that when you log in, it’s the job of the UI you’re looking at (which can tell what chunks you own) to tell you that you need to pay your “upkeep” fee. That is possible, but I’m entirely against this entire line of thought.

digipl · August 10, 2015, 1:29am

And if you don’t log in for a week or a month? Who can trust a network that can delete your data at any moment without you being aware of it?

Why only on log in? A distributed network doesn’t know when it will need to begin erasing chunks. So you, and all the other users, need to check all the chunks permanently. That’s millions and millions of requests per second and a fatal stress for the network.

And what chunks? To choose each chunk you need specific information about time and payment, and there are no time servers in the SAFE network. You would need those servers, which creates a dangerous attack vector.

You need to modify the chunk file structure to add metadata.
Etc…
Etc…
Etc.

This is not a small change. It affects the basic functioning of the SAFE network, and only for the worse.

whiteoutmashups · August 10, 2015, 2:57am

what does this mean?

Onaka · August 10, 2015, 5:28am

When a vault goes offline and comes back online, it may not serve the chunks it still has on its hard drive. It must acquire a new location for itself in XOR space and thus completely new chunks to store.
