That is, by only uploading to one node, the risk is that the node goes offline before replication for some reason (computer crash or whatever), so the chunk was uploaded and paid for but is now lost. On a multi-million node network this would surely (don’t call me Shirley) happen often enough to be noticed regularly.
Or am I still missing something?
My thought is that this is configurable in the client:
Take the first price and always use that - Fastest
Check with 2-5 nodes for price (number configurable) - Slowest
Check with more nodes only if the price is not similar to previous records uploaded - Safest
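To make the three options concrete, here is a rough sketch of how a client might pick a price. Everything in it is made up for illustration (the `PriceMode` names, `choose_price`, the `tolerance` value, and using the median of past prices as "similar"); it is not the actual client API.

```python
from enum import Enum
from statistics import median


class PriceMode(Enum):
    FIRST = "first"        # take the first quote (fastest)
    COMPARE = "compare"    # always compare several quotes (slowest)
    ADAPTIVE = "adaptive"  # compare only when the first quote looks unusual


def choose_price(quotes, mode, nodes_to_check=3, history=(), tolerance=0.25):
    """Pick a price from `quotes`, listed in the order nodes responded.

    `history` holds prices paid for earlier uploads. `tolerance` is the
    assumed fraction by which a quote may differ from the typical past
    price and still count as "similar".
    """
    if not quotes:
        raise ValueError("no quotes received")
    first = quotes[0]
    if mode is PriceMode.FIRST:
        return first
    if mode is PriceMode.ADAPTIVE and history:
        typical = median(history)
        if abs(first - typical) <= tolerance * typical:
            return first  # close enough to past prices, skip extra checks
    # COMPARE, or ADAPTIVE with an unusual quote or no history yet:
    # look at up to `nodes_to_check` quotes and take the cheapest.
    return min(quotes[:nodes_to_check])
```

With quotes of 12, 10 and 11 and a price history around 10, the adaptive mode would accept 12 without asking further, while a first quote of 30 would trigger the comparison and end up at 10.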
Just one. The cheapest (or as desired by client if they want faster security).
No, if this works it’s the same as paying 3/5 really, there’s no real concrete difference I think, beyond more network msgs.
Some things (spends, e.g.) will still require going to all nodes. But for self-validating data, it seems there’s no real need. So less back and forth, faster upload and hopefully saner verification (as well as verifying the replication processes).
It was a realisation when looking at paying 3/5. Why stop at 3/5? Why are we doing it? And as @happybeing notes, it’s a nice simplification and spreading of work across nodes.
We’ve hit this in our churn tests and are accounting for it in terms of what we deem ‘verified’. That process will require a threshold of respondents.
Yeh, that’s more or less what we imagine it could develop into.
Right now, no-verify is useless, so that would become fastest, i.e. with no validation of the put beyond 1 response.
secure may be 3/5.
And yeh, price checking can happen as well. (Right now we’re doing 3/5 store costs and taking the cheapest, e.g.)
You mean a chunk gets uploaded to 1 node, that then starts the replication to 5 nodes - and once a threshold amount of them (like 3 for example) can return the chunk, it is deemed verified?
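If that reading is right, the client-side check could look something like the sketch below. It assumes a chunk is self-validating because its address is a hash of its content; SHA-256 is used here only as a stand-in, and `chunk_address`, `is_verified` and the default threshold of 3 are hypothetical names and values, not the network's real ones.

```python
import hashlib


def chunk_address(content: bytes) -> str:
    # Assumption: the address is simply a hash of the content.
    return hashlib.sha256(content).hexdigest()


def is_verified(address: str, responses, threshold: int = 3) -> bool:
    """Return True once enough queried nodes hand back a valid copy.

    `responses` has one entry per queried node: the bytes it returned,
    or None if it did not respond or does not hold the chunk.
    """
    valid = sum(
        1
        for data in responses
        if data is not None and chunk_address(data) == address
    )
    return valid >= threshold
```

A corrupt or missing copy simply does not count toward the threshold, so 3 good copies out of 5 queried nodes would pass and 2 would not.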
Proper storage systems don’t acknowledge a write until the data is safe. It’s the only way to fly.
On the old style hulking disk arrays sometimes to the extent of it being replicated to a remote site a number of km away before a write is acknowledged at the local site - Synchronous replication. About 30km is the practical limit for responsive synchronous replication. Or sometimes a write is acknowledged when the data is fully resilient at the primary site with the data arriving at the secondary site a few seconds later - Asynchronous replication.
On the newfangled distributed systems - whether cloud based or on prem in a DC - it’s sometimes to the extent that the data has achieved the same level of resilience as all other data. Or sometimes it’s that it has achieved some decent level with full resilience to come shortly afterwards. But usually it’s full resilience that is acknowledged because it doesn’t take much longer and it’s judged to be not worth the risk.
Nothing but nothing that is sensible these days acknowledges a write when data can still be lost by the failure of a single component.
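As a toy illustration of that rule (not how any particular product implements it), a write would only be acknowledged if a copy is still left after any one component fails. The function name and the idea of identifying components by a plain string are assumptions for the sketch.

```python
def survives_single_failure(copy_locations) -> bool:
    """True if losing any one component still leaves at least one copy.

    `copy_locations` lists the component (node, cache card, disk, ...)
    holding each confirmed copy. Two copies on the same component count
    as one, because they fail together.
    """
    components = set(copy_locations)
    # Remove each component in turn and check that something remains.
    return bool(components) and all(
        components - {failed} for failed in components
    )
```

Under this rule a chunk that has reached only the first node would not be acknowledged yet, whereas one confirmed on two separate nodes would.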
I’m sure that this system should aim for that.
I’d kind of be ok with it being configurable to get a faster write or a cheaper write if there is no resilience to start with for use cases where instant resilience isn’t important. But it would get used too much for things it’s not suitable for and would leave the door open to unexpected data loss, complaints and a bad reputation.
There is a block access disk array family I used to work on a few years ago - EMC Symmetrix - that had a highly unusual approach. Whether using synchronous or asynchronous replication the write to an array was deemed to be safe once it had been written to RAM!
Whaaat?! I hear you say. But the write had to be in 2 cache cards in the system. These weren’t DIMMs in servers. These weren’t anything like the ‘dual controller’ systems that were basically just 2 big servers connected to lots of disk enclosures. They were a completely different architecture. The cache cards were shared with the multiple frontend controllers and multiple backend controllers in a matrix configuration where everything was routable to everything else and all the disks. It could be minutes before data was written to disk. These arrays had their own batteries in the bottom that could power the system for 30 minutes while the whole cache was written to disk in the event of a power outage. People would be amazed to see arrays still running after an outage in a DC that had taken out the DC UPS system and left everything else down.
I never heard of anyone having data loss because of this unusual approach of a write being acknowledged while only existing in volatile RAM.
If Safe wants speed could it do something like that? Acknowledge the write when it has got to RAM on multiple nodes and then it is written to disk to achieve full resilience?
On reflection though I don’t think that would bring much advantage in speed for Safe. The majority of the time taken to make a write will always be network transmission, so an extra few ms to get the data onto disk on each node will surely be insignificant and not worth the extra complexity and risk.
If there will be no difference in price I think it will be fine (always pay 1/5). Default verification 3/5, but user changeable. The power of default settings is strong; people are lazy.
I gathered from @joshuef’s reply that the record is not deemed written until it is read back from the network. How many nodes need to have it is, I think, one of the things to be tested. The beauty is that this is part of the client, and if the standard client is not safe enough then a mod will quickly be incoming to fix that, I am sure.