PARSEC and 99% Fault Tolerance

Nigel · August 15, 2018, 12:04pm

Could this be implemented into PARSEC by having neighboring sections watch consensus? Still reading Vitaliks site about 99% fault tolerance here.

As an aside, wasn’t Vlad Zamfir giving crap on Twitter about assumptions? I guess that was about syncrony assumptions so perhaps different.

Also sounds as though time is a factor which may be incompatible with the SAFE Network.

Thought this was interesting and hoped to provoke some discussion amongst the more thechnically minded here such as @mav @tfa @tobbetj @neo @bart @pierrechevalier83 @anon86652309 @qi_ma others as well obviously.

Andre_Gerome · August 15, 2018, 12:15pm

It is true that binary byzantine consensus can be made 99%'ish fault tolerant, like it was done in the seminal paper of Lamport et al.

However total order systems can not. At least not under current assumptions. In the paper: "On the Composition of Authenticated Byzantine Agreement " by
Lindell, Lysyanskaya and Rabin (2004) it was shown, that no matter what the number of byzantine users is in the binary system, serializing or parallizing many of thoses decisions changes the number and the best achievable outcome is 1/3…

So any attempt to break that bound has to enrich the system, beyond public key crypto…

tobbetj · August 15, 2018, 1:27pm

As you tagged me Nigel I will give my two cents. I hope not to disturb a deeper discussion in which I can not follow. Actually tried, around PARSEC development to write in genenral logic how I thought security within groups could be achieved. One of the things I thought of was neighbour groups to overwatch consensus, maybe it effects efficiency way to much or can not be achieved, would it be possible for neighbouring groups to randomly check consensus of groups? This is just ideas from my head, I lack knowledge to go any further or if it is even possible. I think it is probably best to leave further discussion to the master wizards, mav, tfa, neo, pierre, bart and others. I just thought it was fun you tagged me Nigel in a technical topic I have had some ideas about.

pierrechevalier83 · August 15, 2018, 1:30pm

Thanks for your post Nigel. I skimmed the article the other day. It seems to me that you trade off your 99% fault tolerance for pretty strong synchrony assumptions.

Like @Andre_Gerome is saying, there isn’t getting around the 1/3 limit for ABFT (and Vitalik doesn’t claim you can get around it either). More like practical things you can do by relaxing some assumptions.

We kind of already solve some of the same issues by having node ageing and elders: by reducing consensus to elders, an attacker must control >= 1/3 elders which is much more expensive than controlling >= 1/3 of nodes in the network. Of course, ways to improve even more could be worth considering.

Like you’re hinting, our next priority once we get some time to get back to pure research on PARSEC will be to address the issue with our synchrony assumption: our proof for Liveness is slightly hand-wavy for the part that touches the concrete coin and we’re going to want to make that rock solid before introducing anymore complexity in the algo. We have good leads for this. We are simply prioritising a production ready, working implementation of what we’ve already got written up at the moment before spending much more time on exploring these to a full extent.

Just so it doesn’t sound like we’re just being sloppy, the Liveness proof handwavyness we have is no worse than the one found in competing DAG based implementations. (Some decide to actually use a concrete coin, but this has synchrony assumptions that inhibit its use in a dynamic [i.e: truly permisionless] network; some just fully handwave their way out of the situation)

Hope this helps.

Nigel · August 15, 2018, 2:20pm

Well we’ll let them wave themselves goodbye out the door then I don’t have any doubts about PARSEC whatsoever. I’m not anywhere near technical enough to verify for myself claims made in the paper but from a layman it still seems sound and Maidsafe has been nothing but honest and hard working in the four years I’ve been following so I go off from my own ‘assumptions’

I figured there would be a trade off and what’s interesting is it’s harder for you guys to quantify security in the same ways with node aging, etc, compared to other specific algos of other projects. At least that’s what it seems like to me. Which is a shame because you can see how it all will work but you don’t have this number you can run around and tell people (that people want to hear).

Thanks for taking the time to read/post guys

zeroflaw · August 15, 2018, 5:37pm

If consensus is reduced to elders and all you need to become an elder is time, an attacker becomes more powerful by having patience? so the cost of time makes this less likely?
So is there a limit to how long it takes to become an elder, if so isn’t this just slowing down an attack?

What about including provided resource usage as part of being an elder. e.g.

Node ‘Bob’ has offered a 500GB vault, having no reputation with the network it will only use 100mb of the offered space. As Bob moves towards being an elder and behaving correctly more space in his vault will be used by the network (as required). Incorrect behaviour will be punished by reducing space usage through lower ‘elder ranking’. So every time a node behaves incorrectly its punished in a way that means it has less effect over the network, and can earn less ‘safe coin’ making the attack more expensive.

This also has the benefit of larger longer running nodes (costing more resources to run) become more powerful elders (as the network trusts them to hold more data). Stopping people spinning up thousands of nodes under a shared storage system, and just waiting to become elders.

With data already being evenly distributed, no single node will become overly powerful as the ‘elder ranking’ is weighted by how much data the network is trusting them with.

JPL · August 15, 2018, 7:01pm

There are various other mechanisms in place to ensure that biding your time and waiting to be an elder is very unlikely to be a successful strategy including random allocation of nodes to a section and churn. Most are covered in this thread.

pierrechevalier83 · August 15, 2018, 7:28pm

Your suggestion is interesting. Just to comment on a couple of points about elders:

time is money. When you force malicious actors to “just wait”, you make them spend opportunity cost. They are literally loosing money compared to more productive malicious things they could do with their hardware like a 51% attack on some other network.
if they invest the time for their node to become elder, they’re now in a situation where farming will bring in sweet sweet safecoins… Do they still want to attack the network or is it now in their best interest to just cash in the reward for their correct behaviour up to this point?

Now, elder or not, the minute a node misbehaves in a detectable way, they get punished and loose their age. So after all the effort and money spent providing resources to the network, all you gain by trying your attack is an immediate loss of all the trust you gained (or you have to outsmart the network and escape detection, in which case you still need to somehow have gathered >=1/3 of the nodes in a section which statistically requires you to be owning a big part of the network with all the relocations etc.)

These are just a few thoughts. Some of the discussion on these mechanisms is still ongoing, so more details will be coming between now and launch

zeroflaw · August 15, 2018, 9:42pm

@JPL I can’t see how the current security mechanisms address the ’ Sybil attack’ which is documented in that post as possible.

If I’ve managed to create a disproportionately large amount of nodes, because I have access to a botnet or I’m just a well organised individual. This attack seems very real to me. Its seems ‘cheaper’ than the 51% attack, CPU time is currently expensive (this might not be true in the future). However, spinning up thousands of nodes with some storage space (which might not even get used) seems significantly less expensive. (and storage being inexpensive is how the safenetwork expects to be economically viable)

What i’m suggesting is nodes that have participated (spent actual resource, bandwidth, cpu, io) in the networks history / security, have influence over the future. Not ‘Idle’ zombie nodes simply existing to achieve a status.

@pierrechevalier83 My worry was the line

by reducing consensus to elders, an attacker must control >= 1/3 elders

Which suggests something bad is already occurring for the network to reduce consensus for what it does next to only elders, elders which get there status simply through waiting, what if it were the elders which are the problem nodes? you’ve just made the attack easier/successful trying to protect the network.

Unlike blockchain (forks), if as an attacker causes the network to damage/delete one chunk of data. It’s gone forever, the safenetwork has failed and is worthless (This alone could be an economic incentive for an attack). Personally, I would much rather see the network succeed.

XOR networking distributes data over the network to protect it, that same mechanism can protect how the data is manipulated later (holding more data equals more influence). Nodes with similar influence would ‘vote’ (including some lower nodes so they can gain influence). If someone holds 51% of the all the data of the network, then they are essentially the network. They have all the infrastructure costs associated with that, which would be huge.

The safenetwork works on the idea that once you PUT data it’s part of the network forever, which suggests that storing of data is almost free. While I see how this economic model would work, I can’t see how treating all nodes as equal (or ageing them only on time) works as a way of protecting the network. The vaults with data must have absolute power, they are the only reason the network exists.

drehb · August 15, 2018, 10:28pm

While I share some of your concerns, the statement quoted above isn’t accurate I think. I believe the vaults will be performing various routing tasks and resource proofs as they participate in the aging process, so it isn’t totally as easy as just waiting and eventually you’ll be an elder. If that will be enough to prevent a Sybil attack remains to be seen.

neo · August 16, 2018, 6:37am

Yes these bad actor nodes will have to be expending bandwidth with routing and passing blocks through their node and so on. Also once they get to adult they are storing chunks and have to be storing enough to be considered a worthy node to be retained.

So resource cost (real $$$ for an attacker) increases as the node ages from infant through to elder. So yes time is a big factor in an attacker’s budget since that equates to having real resources tied up that the attacker has to pay for. So if a month wait then 10000 nodes is 10000 times cost for VM instance, disk storage, bandwidth for a month. This is no small amount. @zeroflaw Now if consider the network will hopefully have 100K or more nodes within a couple of months of going live then the costs are real and large. The reason ordinary folk can afford this is that they are using one by one their spare resources costing them almost zero cost.

dirvine · August 16, 2018, 7:45am

Reducing the attack to Elders means the attack is much more difficult. It is not easier. Think of it this way, a Node starts as a child, then is evaluated over time, how long depends on the networks need to scale. If a child could affect consensus then it would be weaker security actually. However as @pierrechevalier83 said nodes are evaluated, Elders are evaluated in consensus agreements, Infants and Adults are evaluated at the vault level, this means they must perform their allocated tasks and if not then the Elders punish them and possibly reduce their age. This means the “investment” of an attacker is substantial and long.

This time is very important for many reasons, an obvious one would be a botnet. If an attacker got hold of a botnet and was able to act immediately (take bitcoin as an example) then it can immediately cause issues. In SAFE though it would take a long time to get on the network, store data (many botnet devices, particularly IOT cannot hold that data) and perform its duties while waiting a long time to get to Elder status. Botnets will really struggle when the time is involved as many botnets consist of computers for short periods, for very intermittent periods (user switch off) or from low bandwidth, high latency devices. This “wait” period we have where nodes are evaluated actually makes botnet attacks very difficult indeed, relative to many other designs.,

[EDIT - again ]
In terms of sybil etc. this waiting period or network decides when nodes join also means that if there are people trying to get vaults on the network then they are diluted by others also trying to join. That means an attacker with a huge number of nodes to join will dilute across the network with good nodes. 2 attackers improve the situation, by one diluting the other and so on and so forth. It is a strange way to think of it perhaps but does actually provide quite a bit of security of these types of mass attacks.

This is a really good observation. I do not agree cost is free, but hopefully, it tends very much towards that, so apart from semantics, I think we agree there. However where I am not so much in an agreement is age being only time, it is very much time spent as a valuable member of the network, so holding data, caring for it and distributing it when requested or indeed mutating it when requested and the security is provided along with the consensus of your neighbors.

Yes, they are in many ways as holders of data and processing of requests etc. The client’s though is really the one with absolute power, in that they sign to mutate data, pay resources to keep farmers interested and consume the data. So yes again I agree in many ways, but the vaults are pretty dumb by design. They cannot, for instance, decrypt data and mostly have no idea what data they have. So they are like the network drones and just behave as programmed, the humans and the clients are the real powerhouses here, thankfully.

I believe over time vaults will get smarter, but their duty is only to care for our security and data in a way that allows privacy security and freedom of all digital info for all humans on the planet. IT is probably a viewpoint and one of many, but this is how I see things, I hope it helps.

zeroflaw · August 16, 2018, 9:05am

So becoming an elder takes:

Time
Holding data
Distributing data
Mutating data

By doing all of these things, a node is providing value and slowly becomes an elder. While the attacker is playing along they are burning real world resources. Okay, so now I agree that this attack becomes more costly as the nodes have to participate in the network to get to elder status. Which sounds much better than a simple wait time.

So why not directly rank each node on how much data it’s holding, distributing and mutating (which due to XOR networking will be distributed by design). I can’t see why if the network has trusted a node to hold 3GB of data, it may have equal influence with a node holding 1GB of data. Should a close group consensus not be made of a minimum amount of data held. So (in the networks early days), XOR networking has distributed data so the average node holding is 1GB, so the group is made of 8 nodes and a minimum of 8GB of storage. (average means some will be more/less than a 1GB) as the ones which have less behave correctly, the data is slowly distributed to them, sharing the responsibility.

In the real world I would also expect the above ranking calculation to include how often the data is access, mutated. Which should would also change the way the data is distributed.

neo · August 16, 2018, 9:16am

The measure of ability is not in just how much data they can store and reliably retrieve. There is also the speed at which they respond. Also the bandwidth they can transfer data. Also can they keep this up long enough to go through all the hoops to age one level and then another than another and so on.

A node is not ranked at all on the amount of data it accepts, after all it could be lying. It is ranked on whether it successfully retrieves the chunks when requested. So the node has to both accept chunks then retrieve all the chunks asked of it. The rewards are only paid out when the node can successfully remain a node and upon the successful retrieval of whats requested from it.

zeroflaw · August 16, 2018, 9:31am

If a node is lying about accepting data, when data is requested by the network, it will find out they lied and lower that node rank by the data that’s missing, losing influence on votes. As data is distributed evenly, it won’t become overly powerful before its found out.

I would expect latency and bandwidth to also be included in the ranking indirectly, but that would be maintained through XOR networking

digipl · August 16, 2018, 9:32am

In addition to what was said by @neo, the data that a vault could store is similar to other vaults that share the same XOR space. A majority of elders must sign the data block of the Datachain so a majority of elders have the same data.

zeroflaw · August 16, 2018, 9:34am

I thought data was distributed over XOR space for redundancy?

digipl · August 16, 2018, 9:34am

TylerAbeoJordan · August 16, 2018, 9:58am

I posted an loose idea related to consensus a while back, not sure if anyone saw it or not:

https://forum.autonomi.community/t/privilege-wager-market-consensus-addon

As I say in the post, it’s probably stupid … but being an intuitive person and not a super duper math person, I leave it to the extreme coders to consider the possibilities.

neo · August 16, 2018, 10:04am

Yes that is true, but a chunk has only one address and the section responsible for that chunk address has its various nodes storing that chunk. The nodes all have different XOR address so it is spread across numerous addresses. These addresses are not in one location though.

The chunks of a file will have different XOR addresses and so data is spread across XOR space. The point is that one chunk has one address and one section responsible for it and those nodes are around the world.

Topic		Replies	Views
Step-by-step: the road to Fleming, 2: Sybil resilience Development development , fleming , routing	48	4319	March 11, 2019
Consensus 28/32 or 8/8? Development	169	8824	June 19, 2018
Analysing the google attack Blog Posts	105	19013	May 22, 2018
Regarding a Sybil attack Beginners	35	3750	June 6, 2017
Sidelining Sybil Attacks Blog Posts sybil , malice , medium-reposts	26	1629	May 3, 2019

PARSEC and 99% Fault Tolerance

Related topics