Non-persistent vaults

That’s horrible. With the upload throttled like that, I imagine it’s a download-only cap and not upload. It’s still horrendous, though, and we will need to work out some way around these caps. It’s crazy: Viv was showing me his pal’s broadband in Holland, 1 Gb up and down; here we have 100+ Mb, and China is now pushing to fix that there too. These limits are a really bad thing for business etc. I know we can check for already-existing data and not try to send any if the first, last and middle chunks are online, which helps a little. For private, unique data, though, it’s another story. Worth some consideration here to see what can be done.

Gives us something to think about though :wink:

2 Likes

I don’t know of an ISP around here that doesn’t cap internet download usage. It usually falls somewhere between 200 GB and a couple of TB, I think.

Xplornet is even worse, with a 20 GB download cap.

Seriously though, man, I think that without some careful thinking and planning, non-persistent vaults might kill SAFE in North America or anywhere else that has slow internet or an internet cap.

2 Likes

Regardless, we will get measurements in any case and work that out for sure. Any way it’s cut, a vault going offline will lose some chunks and need to download more if it’s farming. Measurements seem critical, but I seriously doubt there will be an issue. A lot depends on take-up and distribution of data.

I bet Satoshi is glad he retired before the Block Size debate sprang to full life…

This reminds me of that.

I suspect farming will find a way to be economically effective no matter what the ground rules are… The currency ought to guarantee it… No matter what the rules are, some people will like them and some people won’t… On the other hand, the cost of running the network will be what it is, and the currency will reflect that truth as well…

2 Likes

Thank you for bringing this up.

I never noticed my own usage until I checked. My ISP does have a 250GB threshold. If I exceed it, they give me an additional 50GB and charge me for it.

At least my service is not interrupted. But I have to stay below the threshold to keep my bill from increasing. Thankfully, my average usage is around 100GB per month, which leaves 100GB for SAFE farming. It’s prudent to reserve 50GB for unexpected usage.

100GB personal + 100GB farming + 50GB reserve = 250GB

Chunk Collection Rate
This raises a very important point for farmers. We have to calculate how many vaults our data cap and bandwidth can support. If a vault receives 1GB of chunks per month, then I can run 100 vaults. Data caps should be considered when farmers set up their vaults.

Ideally, I’ll max out my farming data cap of 100GB per month. My 1TB drive will have to remain connected for 10 months in order to fill up. That is unlikely, as my connection drops every 2 months, LOL!
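
Just to make that arithmetic explicit, here’s a quick back-of-the-envelope sketch in Rust. The figures are the ones from this post, and the 1GB-of-chunks-per-vault-per-month intake is my assumption, not a measured number:

```rust
// Back-of-the-envelope farming budget, using the figures from the post above.
fn main() {
    let isp_cap_gb = 250.0;   // monthly ISP threshold
    let personal_gb = 100.0;  // average personal usage
    let reserve_gb = 50.0;    // buffer for unexpected usage
    let farming_gb = isp_cap_gb - personal_gb - reserve_gb; // 100 GB/month

    // Assumption: each vault takes in roughly 1 GB of chunks per month.
    let gb_per_vault = 1.0;
    let max_vaults = (farming_gb / gb_per_vault) as u32; // 100 vaults

    // Time to fill a 1 TB drive at the full farming budget.
    let drive_gb = 1000.0;
    let months_to_fill = drive_gb / farming_gb; // ~10 months

    println!("farming budget: {} GB/month", farming_gb);
    println!("max vaults: {}", max_vaults);
    println!("months to fill drive: {}", months_to_fill);
}
```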

This doesn’t mean I’m against non-persistent vaults. It just means we (farmers) have an additional consideration: the cost of running a farm. I hope we get Meshnet up ASAP, because Google Fiber is looking good once mass adoption hits.

I agree with the goal to get a little bit of Safecoin into everyone’s hands. We’ll see how it pans out in TestNet3.

3 Likes

One word in response to bandwidth caps and speed throttling by ISPs: mesh networking in CRUST :smile:

I will also note, for people who like living in a city centre: the old part usually (at least in Brussels and Glasgow) lags about 5-7 years behind in internet connectivity. :slight_smile: So Australia, you’re not alone :slight_smile:

On the upside, the potential for WiFi mesh networking in a crowded city centre is much higher!

3 Likes

Yup, they used this in Hong Kong during the protests. Firechat was the thing to be on. Only problem… you’re connected until you’re not. It only works within a range of, say, 20 meters. So when you walk through the city, this is gonna be a problem, I think.

Another thing about the non-persistent vaults. SAFEnet routes chunks in such a way that a little 1MB file comes to you from a vault in China, over a hop in Japan, then through a hop in Germany, and finally to your computer (it can even use up to 6 hops). So when we think about Netflix consuming 30% of US bandwidth, what about all these nodes that go off- and online? That’s a lot of chunks moving around! If we add that to the chunks going over all these hops across the planet, aren’t we filling up way too much bandwidth?
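
To put a rough number on the hop question: every intermediate relay both receives and re-sends a chunk, so total interface traffic grows linearly with the hop count. A minimal sketch (the formula is a simplification for illustration, not a measured network figure):

```rust
// Rough aggregate-traffic estimate for multi-hop chunk delivery.
// The source uploads the chunk once, the requester downloads it once,
// and every intermediate relay both downloads and uploads it.
fn aggregate_traffic_mb(chunk_mb: f64, relays: u32) -> f64 {
    chunk_mb * (2.0 * relays as f64 + 2.0)
}

fn main() {
    // 1 MB chunks and up to 6 hops, as mentioned in the post above.
    for &relays in &[0u32, 3, 6] {
        println!(
            "{} relays: {:.0} MB total up+down traffic per 1 MB chunk",
            relays,
            aggregate_traffic_mb(1.0, relays)
        );
    }
}
```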

2 Likes

This is an interesting question. It can be balanced by the network favouring nodes that have been online for longer, which is good for encouraging this. The question is, will a ceiling of +20% of the average farming reward be enough to compensate those nodes for the bandwidth they free up by being online for long uninterrupted periods? Also, taking a lot of such nodes offline all at once makes for a significant potential attack if the balance is not right: not so much from losing data, but as a kind of DDoS. Maybe not worth it, but it depends on the severity of this issue. It would not be great to create a new non-DDoS-able network and then find it being periodically stressed in this way whenever a big actor wants to drive down its popularity.

I’ve been arguing here for the longest time that telcos won’t be stupid and that they’ll farm the farmers. Some expected they’d idly stand by while people act as mini ISPs.

I think that eventually all busy telcos are going to introduce “reasonable” caps. Farming will move to the telcos that don’t have caps, so they’ll quickly figure out something’s wrong.

It’s funny how net neutrality proponents here (not you) expected that NN would mean they could consume 50x the average and pay 1/50th of the cost. Artificial scarcity :wink:

Bottom line is, similar to Bitcoin miners, we’ll all have caps one day and deal with them. The cost of traffic will be passed on to users.

3 Likes

If the SAFE client allows you to connect to IPs of the same provider, things can work out fine for them. It could even take a lot of load away, thanks to caching.

That’s certainly possible.
If you happen to be acting as a cache for a client connected to the same network, that would benefit both the telco and the client, so in theory you shouldn’t be charged.
And they might well not charge you in practice either, because the traffic isn’t leaving their network, and internal network overload is much less likely to happen.

The beauty of MaidSafe is exactly here: because we have an overlay network that maps over the IP layer, it creates exactly the fluidity that is currently missing for mesh networking. Your client ID can simply hop from one IP/WiFi broadcast point to the next, or use multiple at once. For delivering services this is perfect, because you have a consistent ID to work with (your MAID), regardless of what the underlying IP layer is doing.

And if you temporarily get disconnected, the autonomous network will have carried on, and you just catch up to the latest results from while you were disconnected.
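
A minimal sketch of that idea, assuming a simple table that maps a stable overlay ID to whatever transport endpoints are currently live; all names here are illustrative, not the actual routing API:

```rust
use std::collections::HashMap;

// Hypothetical sketch: a persistent overlay ID (the MAID) mapped onto
// whatever transport endpoints the device currently has. The overlay
// ID never changes; only the endpoint list does.
struct EndpointTable {
    endpoints: HashMap<String, Vec<String>>, // MAID -> live endpoints
}

impl EndpointTable {
    fn roamed(&mut self, maid: &str, new_endpoint: &str) {
        // Device hopped to a new WiFi broadcast point / IP address:
        // record the new endpoint under the same stable overlay ID.
        self.endpoints
            .entry(maid.to_string())
            .or_default()
            .push(new_endpoint.to_string());
    }

    fn reachable(&self, maid: &str) -> Option<&Vec<String>> {
        // Services address the MAID; the table resolves it to whatever
        // IP-layer endpoints happen to be live right now.
        self.endpoints.get(maid)
    }
}

fn main() {
    let mut table = EndpointTable { endpoints: HashMap::new() };
    table.roamed("maid-abc123", "192.0.2.10:5483");
    table.roamed("maid-abc123", "198.51.100.7:5483"); // walked to new WiFi
    println!("{:?}", table.reachable("maid-abc123"));
}
```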

3 Likes

Yes, the first concern that came to me when evaluating David’s proposal was also the increase in bandwidth usage. But this is already compensated for by two factors. First, more smaller, average-sized vaults distribute this load while strengthening the network. Secondly, archive vaults (the pmid nodes) can evolve into exactly that: archive vaults. We know that the majority of the data humanity produces is never accessed again. The DHT nature of the SAFE network helps reduce this volume of data, but more importantly we need to let the network decide which data has to be (only) archived and which data also has to be fluidly available (with reduced latency).

Remember that the long-standing logic still holds: all data is archived, but if it is archived on a node that is stably online and never requested again, that data will fall out of the ‘fluid phase’. Until a data chunk has found a node where it can settle, it will indeed keep flowing around.

The flow of data caused by data being requested is not affected by persistent or non-persistent vaults. Neither is the amount of data flow caused by churn events; in fact, it is only better spread over the network’s resources. The network still asks for 4 live copies of the data; the only thing we lose is keeping track of the old, dead copies. We are only removing useless state from the network by introducing non-persistent vaults, which reduces the overhead of maintaining and synchronising it.

People also seem to think that if you attach a new vault of a given size X to the network, it will start sucking a data volume of size X into itself. That is not true. The network only pushes: if it finds space on your newly attached vault, it will push new data into it at the rate the network dictates. Your vault cannot request to store more chunks. It is only ever commanded to store a new chunk, and it will accept such commands until it is full.
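
A minimal sketch of that push-only model under the rules described above (store-on-command only, full vaults refuse); the types and names are illustrative, not the actual vault code:

```rust
// Illustrative push-only storage model: vaults never pull chunks,
// they only accept (or refuse) store commands issued by the network.
struct Vault {
    capacity: u64,
    used: u64,
}

impl Vault {
    /// React to a store command from the chunk’s managers. A vault has
    /// no code path that *requests* chunks; this is the only way in.
    fn on_store_command(&mut self, chunk: &[u8]) -> Result<(), &'static str> {
        let size = chunk.len() as u64;
        if self.used + size > self.capacity {
            // Full: refuse, and the close group commands another node
            // instead, keeping the number of live copies at 4.
            return Err("vault full");
        }
        self.used += size;
        // ... persist the chunk to disk here ...
        Ok(())
    }
}

fn main() {
    let mut vault = Vault { capacity: 2_000_000, used: 0 };
    let chunk = vec![0u8; 1_000_000]; // a 1 MB chunk
    assert!(vault.on_store_command(&chunk).is_ok());
    assert!(vault.on_store_command(&chunk).is_ok());
    assert!(vault.on_store_command(&chunk).is_err()); // full now
}
```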

3 Likes

@BenMS, so when you turn on your vault and it gets an ID, it will not immediately suck down all of the files that it is closest to – it will wait until one of the nodes holding those files turns off, and then it will be assigned those files?

Do I kinda get that right?

4 Likes

CRUST in action! :stuck_out_tongue: I saw the video by David talking about CRUST, and how it detects in milliseconds that a node has gone offline. Really great! So while walking around on the streets, you connect in a mesh way, maybe to some others for only 2 or 3 minutes. In that time CRUST will use those links to PUT/GET a couple of chunks, while at the same time using connections to others as well. So if one goes down, no problem: it will use the others that are still live? Really magic, and now I really want to see this stuff in action!

This is an interesting phrasing of the question. Whichever way safecoin flattens the earning rate for above-average vaults, your question is the right one to ask. A rephrase: what is the value of storing legacy data that might never be requested by an explicit get_request again? The first answer is, of course: if it ever is requested and you held on to it longer than others (by staying online), you increase your chances of farming that coin. Remember that if you stay online longer than others, you can fill your vault more and increase your farming rate (by having more chunks, as it has always been). So non-persistent vaults actually give a safecoin advantage to always being online, which is valuable to the network.

A much stronger argument for wanting to keep your vault online is a new one, I predict: distributed computations. Not only is burning CPU cycles for distributed computations (paid for with safecoin) obviously going to be more lucrative than holding out for that rare request for a forgotten chunk; it also scales linearly with the amount of computation performed, much better than burning your CPU just to keep your hard disk spinning, so to speak.

It gets better, though: humans might show the behaviour that the majority of the documents we produce are never accessed again; machines do not. In fact, the landslide of freshly produced data in the world is not in typed Word documents and spreadsheets; it is in sensory data stored by machines, measurements of all kinds. And this is exactly where the future ‘gold mine’ is: these legacy data sets will not be fetched and transferred over the network; instead, the computational work will come to the nodes that have carefully held on to that data. This is where, in SAFE 2.0, you might just find a new competitive edge.

2 Likes

This has always been the case I think: greater availability → higher rank → higher rewards

So my question revolves around the shift to competing with a lot of devices which have negligible cost. It would be fine if we didn’t need these archive vaults, but I’m thinking now we need them more than with persistent vaults (where minor outages wouldn’t cause loss of chunks).

I suppose a better phrasing would be: if there aren’t enough archive vaults, how can the pricing be adjusted to encourage more of them? If you adjust the rewards for everyone, you aren’t really incentivising archive vaults specifically, so you get an open feedback loop.

Do you know something we don’t? So far I’ve only heard this as a possibility, and not specifically as distributed computation in the core, but rather as distributed apps, which is not quite the same.

2 Likes

The majority of first-generation nodes will most likely be desktops, laptops and even cloud-hosted nodes (although data in the cloud is relatively expensive, so those are at a disadvantage: even if you own the data centre, renting it out to businesses as a cloud service is going to be more lucrative than farming, and additionally up to 50% of data centre costs are in cooling. Those cooling costs evaporate for a desktop or laptop run by a user). So from that we’d expect that an average vault could easily go up to 500GB for desktops, and be brought down to, say, 50GB for laptops (both through smaller disks and effectively through being offline more). This might come down even further when phones enter the game later on (not just as clients, but as full vaults), and maybe at some point we will want the network to learn to differentiate between different types of capabilities. But the first priority for SAFE really has to be to spread out over as many different nodes as possible and to make sure that centralising forces are compensated by decentralising forces.

Again, for the network it is better to lose (a large number of) nodes whose vault size is average relative to the average bandwidth capability than (a large number of) nodes with a big vault size relative to the average bandwidth capability. So non-persistent vaults push the vault size towards the average bandwidth capability, and they also open the network up to millions more devices (such as charging phones) and embedded devices; their numbers are well worth the consideration.

The first reason of course is: “What we don’t code makes us stronger”; a network with less state is always going to be more resilient and secure.

2 Likes

I’m curious about what happens to very small (phone) vaults that never get turned off. My phone rarely ever gets restarted. It also has very little storage. If it becomes an archive, its small size means I’ll get very, very few hits… Would this be an incentive to turn the device off more? If I chose not to turn it off, would I be compensated more per hit? Have I missed something?

It seems like vaults with small storage space would have an incentive to restart periodically, so they are more likely to hold “fresh” data?

It seems like the only advantage to being an archive in 1.0 is being big.

Yes, you got that right. When you turn on your vault and you get assigned a new ID, all the existing vaults will indeed start informing you about which chunks are stored, and about all other account information for client accounts and managed nodes, but this is just tens to a few hundreds of MBs (depending on how much data and how many accounts the network holds). The chunks themselves will not be moved and pushed to a new ‘closer’ node.

So, as an example, consider that no nodes ever go offline, and that initially some chunk is stored redundantly close to the name it has. Now imagine that many more nodes join the network, so in effect the name space becomes more and more densely populated with new nodes. What happens is that the new nodes closest to the name of that first chunk will be told who the original nodes are that hold that chunk. But those old nodes will have left the close group of the chunk they are holding, because new nodes are now closer.

This example shows how the information about where the chunk is stored is passed on, while the actual chunk does not have to move.
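
For anyone who wants to see that “closeness” idea concretely, here is a toy sketch of XOR distance and close groups. It uses 8-bit IDs and made-up node names purely for illustration; the real network uses far larger addresses and group sizes:

```rust
// Toy illustration of XOR closeness, the metric behind “close group”.
// 8-bit IDs for readability; real network addresses are much larger.
fn xor_distance(a: u8, b: u8) -> u8 {
    a ^ b
}

/// Return the `k` node IDs closest to a chunk name.
fn close_group(chunk_name: u8, nodes: &[u8], k: usize) -> Vec<u8> {
    let mut sorted = nodes.to_vec();
    sorted.sort_by_key(|&n| xor_distance(n, chunk_name));
    sorted.into_iter().take(k).collect()
}

fn main() {
    let chunk = 0b0110_0000;
    let old_nodes = [0b0100_0000, 0b1110_0000, 0b0000_1111, 0b1010_1010];
    println!("old close group: {:?}", close_group(chunk, &old_nodes, 2));

    // New nodes join, densifying the name space near the chunk name.
    let mut new_nodes = old_nodes.to_vec();
    new_nodes.extend([0b0110_0001, 0b0110_1000]);
    // The close group changes, so *who is told about* the chunk changes,
    // but the chunk itself stays on the node where it was stored.
    println!("new close group: {:?}", close_group(chunk, &new_nodes, 2));
}
```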