Node Ageing RFC

This is a neat idea :thumbsup:. By accounting for weightage 0 it touches the second point @dirvine mentioned.

What I’m wondering, though, is whether, without a penalty to age/rank/weight/voting-right (we’ve got too many synonyms for this :stuck_out_tongue:) after a restart, the problem isn’t just shifted to the next group.

The problem with persistent nodes was that attackers could “buy” other nodes in a given group, compromise that group, and gain control of it after a restart. With this approach, an attacker is now just going to buy “high rank nodes” and keep restarting them until they start in their old group, relocate to a random group (Y), and wait until this Y is the group they want to target. Now, with their rank/age intact, they’ve got a larger window of time to repeat the same process until they hold Grp-Quo+1 nodes to stall the group, or Quorum to control it. Putting a penalty on restarts does mean extra churn to handle, since the node’s new lower rank is going to get it relocated sooner from Y, but that might be essential?

This is a key point. While it seems “harsh” from a normal user’s POV that “me restarting my computer made me lose half my rank”, in terms of the network a node restart basically invalidates almost all of the node’s earlier actions as far as how much the network “trusts” it. It could have been the best node before the restart, providing data, very high BW and so on. After a restart, regardless of “how fast” the event occurred, the network’s trust is basically gone (this is not related to the duration of the restart), as it can now be a malicious node, have no data space, or have crappy BW, and the network kinda needs to start profiling it again before it regains that trust. The rank/2 itself is a bit of a freebie here, when rank 0 on restart could maybe be seen as the ideal case from the network’s POV?
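
To make the options being weighed here concrete (keep the age, halve it, or reset it to zero), here is a minimal sketch. The function name `age_after_restart` and the policy labels are hypothetical, and rounding down when halving is an assumption; the RFC may define the penalty differently.

```python
def age_after_restart(age: int, policy: str = "halve") -> int:
    """Return a node's age after a restart under one of three policies.

    Hypothetical helper for illustration only:
      "keep"  - no penalty, the age survives the restart
      "halve" - the rank/2 penalty discussed above (integer division assumed)
      "reset" - the node starts again from zero
    """
    if policy == "keep":
        return age
    if policy == "halve":
        return age // 2
    if policy == "reset":
        return 0
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("keep", "halve", "reset"):
        print(policy, age_after_restart(8, policy))
```

Because the time between relocations is meant to grow exponentially with age, even the halving policy costs a long-lived node far more than half of its accumulated standing.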

10 Likes

I like the idea of this RFC. It looks quite advanced but when you read the logic several times it makes a lot of sense. I have 3 questions:

  • On a regular PC or Mac, how long will it take to do the POW before you’ve created JoiningProof? Is it like 1 minute or 10 minutes on average?
  • Same question for AGE. On average, how long does it take before I get redirected to a new group and am able to Farm?
  • When my age is higher, do I make more Safecoin per PUT? And how much more is it?
1 Like

This should be less than 5 minutes on a small machine (we need to calculate what small means here) so probably between 1 and 5 minutes

This depends on a few things, churn mainly: how many new nodes start, how many old ones leave, etc. Nodes will always try to farm, but new ones (infants) will struggle until they settle in a group long enough to be able to satisfy requests. As a node joins a group, it will get the data chain and know what data it should have. It then asks nodes for different chunks until it can “claim” to have the group data (from the data chains RFC). At that time it will be asked to deliver data (and be able to receive rewards). If it cannot deliver the data it should, then it’s disconnected and back to age zero.

No but you will be able to deliver more data proportionally as you spend less time relocating and getting up to group level data. So a higher age means you settle in a group to deliver data for longer periods, thereby you earn more.

Hope that helps, please keep plugging away at this though, it’s a nice discussion, but probably should be in the dev forum, maybe @frabrunelle can look at that later?

5 Likes

David, are there any natural analogues you’ve been able to draw on here?

Obviously in an ant colony, new members can be trusted by default due to genetic heritage, but here, it is like new ant-like organisms arriving from outside and asking to join the community. What inspirations have you had?

Are there such things as “cuckoo” ants I wonder? :slight_smile:

EDIT: many congratulations by the way! :beers:

2 Likes

I suspect that any groups in the natural world that accept new members (even children) will only trust them incrementally as they share the workload. Even then in the early days any deviation from the rules is met with the likelihood of expulsion and start again. Like a wolf joining a pack, it’s treated pretty rough to begin with, if it’s accepted at all.

Even in that case, if a member rejoins the group they start fresh. Older, well known members perhaps do not get expelled, but they do need to “do time” or at least lose respect/faith until they regain it.

So this feels very much like a natural system from that perspective.

7 Likes

Ahhh, you’ve been studying the forum :wink: good one. Glad to know my time here hasn’t been wasted lol :laughing:

5 Likes

Ants do change roles according to age. Young ones nurse and clean, older ones forage, and the oldest ones scout and fight. They also change roles if there is an exceptional need. This is pretty much how the SAFE network is currently being designed :slight_smile:

If an ant misbehaves, the group detects this and kills it. I don’t know all the situations in which this occurs, but one is when a queen tries to start a colony inside the current one. I suspect another is when an ant is developmentally challenged and not able to function as expected.

3 Likes

Thanks, I was aware they changed roles according to need - current context - but not age. I.e. need more foragers, other ants switch roles to foragers etc.

1 Like

Yes in harvester ants there are at least 50% waiting in the wings to be allocated work.

2 Likes

I would like to throw in some info on the physical application side of this. My plan is to exploit the extensive fiber networks in my area (two Google Fiber cities and one open fiber network, “UTOPIA”, where we will be the ISP) and provide free fiber WiFi routers (with a mesh network built in, of course) door to door (I would also love to be able to pay their individual internet bills), each of which will be a unique node (farmer) for the SAFE network. So theoretically our little local network will be able to support the SAFE network with a million households, each with a unique/separate 1-10 Gbps (symmetric upload/download) node that should run pretty consistently (routers) over a long period of time. The only limitation is how large and how consistent the reward for expanding a physical network like this would be.

Now I know there will be some programming involved in locking the farmer hardware to pay out safecoin to a pre-programmed wallet address… but is there a way of giving age credit to a network of nodes that share the same safecoin payout address? Additionally, is there a way of reserving a higher credit for farmers that need to reset but share a safecoin address that has proven to be an extensive, high-value resource to the network?

3 Likes

Maybe have a system where a node can pre-announce it needs to reset, maybe create a crypto sig, and then reset. Then if the node returns within a reasonable “time” (Qty of transactions?) then it goes on the fast track to regaining its rank. It still has to prove itself as “healthy”, just like an injured sportsman has to, but once shown healthy then it regains its rank quickly.

3 Likes

This is a fantastic idea. I2P has this idea implemented. It is called graceful shutdown. This notifies other nodes that the tunnel will shut down, so they know they need to build a new tunnel. Same situation, but different aspects: replace tunnel with data. A sudden shutdown causes problems in both the short and the long run. Yes, we do need additional reserves to handle the unexpected. Whatever the reason, nodes should be informed of what is happening. This gives the network a chance to cope with the repercussions and maintain the order of the system.

Nodes should still be given a chance to recover and become better nodes. Health should be assigned to nodes, like @neo stated. Another thing to add: with a class system it is easier to maintain order than with a free-for-all. Anyone who plays RPG games will understand how much a class system can affect things. One class would be too strong, and others would be too weak. DnD is a great example. Dota, Dota 2, HoN, and LoL are other great examples. Each character plays a role from a certain class to achieve great results. The biggest hurdle in those games is that 5 players each choose a role, and if they choose the wrong roles, it affects the entire game from the start. 5 carries vs a well balanced team: who wins? Obviously the well balanced team. A balanced team of 5 contains a carry, a ganker, a supporter, crowd control, and a tank. Let’s expand on this further, taking this idea and merging it into safenet.

A ganker is a character who can set up ganks (raids) to tackle a problem, to delay, to give the team a chance to hold on, and to dominate the game. The ganker role would most likely be filled by cell phone users. These users most likely want to communicate, then disconnect. It costs them data to run it. Safenet requires more data than mobile towers.

A supporter is a character who heals other characters and sets up wards to keep an eye out for enemies. This is a stage before the archive nodes. Such a node already has good health, maintains its uptime, and keeps an eye out for bad nodes. It actually provides a boost for gankers, aka cellphone nodes. Supporters are excellent throughout the entire game. No matter what, they will always be useful even though they are weak. Weak meaning that gankers can get all roles killed if they perform terribly.

Crowd control is a character who manipulates the situation: delaying the carry from attacking, taking down enemies, moving characters out of the way, and so forth. This role would be good for people who only want to stay on for a few hours or so, and will be disconnected for the rest of the day. They hold data, then transfer it to another node before a graceful shutdown. Gankers become weak in the long run. That’s where the carry comes into play.

A carry is a character who carries the entire team to victory. However, carries are slow at the start and can’t do much. But by the end, the carry has earned items that greatly benefit the entire team and itself, and it can single-handedly change the game and/or end it. This is the final stage before becoming a tank. These nodes are old enough to be archive nodes, but are not archive nodes yet because they are not “tanks”. They can be taken out by gankers, supporters, and crowd control.

A tank is a character who can take a lot of damage and still survive. These are the nodes with 99 percent uptime. They can withstand everything that comes at them. The general rule is to ignore the tank and focus on any other character. They become the archive nodes.

So we have 32 nodes per cluster (bucket), and we have a new RFC proposal for split groups, so there would be 16-16 nodes. Each of the 16 nodes plays a role. Each split group would contain each of the specific roles I described above. From my perspective, cellphone users should be 8 per split group. There are more cellphone users than desktop users. Cellphone users cost safenet survival.

This of course will change when we have open source mesh networking, and professional code that doesn’t take a lot of CPU resources… Then cellphone users can actually become archive nodes.

1 Like

Great point.

I realize this RFC focuses on joining and relocating, but ‘graceful departure’ could perhaps reduce peaky network load

  • sudden departure = unexpected spike in network activity
  • graceful departure (ie announce departure X seconds ahead) = group spreads churn load over X seconds

This way, when imminent loss of connection is known by the vault in advance, the vault has a mechanism by which network load can be smooth rather than rushed.
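
A rough way to picture the difference, assuming the departing vault’s chunks must all be re-replicated and that the work is spread evenly over the notice period (both simplifications of mine, not something taken from the RFC):

```python
def peak_chunks_per_second(chunks_to_move: int, notice_seconds: int) -> float:
    """Average transfer rate the group needs while handling one departure.

    Hypothetical model: a sudden departure is treated as a 1-second window,
    while a graceful departure spreads the same work over the notice period.
    """
    window = max(1, notice_seconds)
    return chunks_to_move / window


if __name__ == "__main__":
    chunks = 600
    print("sudden:  ", peak_chunks_per_second(chunks, 0))
    print("graceful:", peak_chunks_per_second(chunks, 60))
```

The total work is identical in both cases; only the peak changes, which is the point being made above.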

1 Like

Perhaps it would be prudent to start with a conservative model, with maximum security in mind, then consider these sort of restart issues later?

I understand that we want to spread the load out over the network to nodes of all classes, but I suspect some of these optimisations could be considered later on (after network is stable).

2 Likes

I agree with this point, in the broader sense that we have seen several new RFCs created these days. While this is great, for they all deal with crucial aspects of the network (and when I read them I am completely enthralled by the new areas they tackle), it seems that each time a new one is created, the date of the stable release of the network moves a little bit further away.

While my technical knowledge does little to allow me to comment on whether this or that RFC is critical before going live, I wonder whether one of the most urgent tasks would not be to solve the vault update mechanism. Would this not allow RFCs coming after that to be added incrementally, on the stable network?

1 Like

I thought that inefficient churn was one of the biggest factors in the failure of the current vaults from home network, along with unequal nodes?

If this is the case, solving churn related issues could be important, and ‘graceful departure’ plus scheduled rejoining should achieve this.

For example, if I want to restart my computer / vault: I announce the desire for departure and the anticipated re-join time (in some way that makes sense to the network), so the network figures out what needs to be done before departure, if anything. If I stick to this by restarting only when the network has ‘prepared itself’, and re-join as planned, the network disruption could be small to nothing. Data chains may be needed for this.

If every time a computer is restarted a churn event occurs, it’s a big strain on the network that would be best avoided.

4 Likes

This is the key to a stable route to release though: open debate and detail in the open, as opposed to single engineers coding what they think is OK in order to test it. Software is just engineering, and it’s mostly design. So there is high level design, then implementation design.

You can imagine it’s like a project, scope of work docs → detailed design docs → implementation detail (RFC’s) docs → implementation → test → release.

So the RFC’s are in this case simply a much more open view of what we are doing and why. Many show improvements as the final bits are put in place. For instance, look at what this one solves compared with the code required. It assumes data chains are in place. On each churn, nodes check the age of each group member against how many churn events have happened; if 2^age churn events have happened, then relocate the node.
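
As a rough illustration of how small that check could be, here is a sketch that reads the rule as “relocate a member once 2^age churn events have passed since it joined the group”. The `Member` type, the field names, and the exact threshold are assumptions for illustration, not the actual routing code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    """Hypothetical view of a group member, as tracked by its peers."""
    name: str
    age: int
    churns_in_group: int = 0


def relocation_threshold(age: int) -> int:
    # Assumption: 2**age churn events. Other posts in this thread
    # describe a schedule that matches 2**(age - 1) instead.
    return 2 ** age


def on_churn(members: List[Member]) -> List[Member]:
    """Record one churn event and return the members now due to relocate."""
    due = []
    for member in members:
        member.churns_in_group += 1
        if member.churns_in_group >= relocation_threshold(member.age):
            due.append(member)
    return due


if __name__ == "__main__":
    group = [Member("infant", 1), Member("adult", 3)]
    for event in range(1, 9):
        names = [m.name for m in on_churn(group)]
        print(f"churn {event}: due for relocation = {names}")
```

The sketch only reports which members are due; removing them from the group and choosing their destination is left out.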

It’s not a lot of code (disjoint groups was though a huge amount of work) to give us the ability to keep low performing nodes out of the consensus mechanism plus at the same time prevent mass joining attacks and reward longer standing nodes more, just with that small check.

So it may look like we are doing much more work, but in fact it’s not, we are doing much more thinking and finding edge cases before coding, so the coding is quicker and results evaluated accurately.

This project, like rust/ember etc., which use an RFC process like this, will have a ton of RFC’s that continually improve the code base.

Releases do not change though, that happens as per normal, the RFC’s will never be finished, they will constantly be added. This is just code and design improving over time. If an RFC though adds a security level we need or covers a security issue we are exposed to then it becomes more pertinent to look at release schedules with that in mind. After launch that would be a hot update also.

So the RFC’s should not be seen as blocking anything, but clearing up hidden work and exposing algorithms and their reason in a more digestible way than the raw code.

If anything, these should make launch faster, as we can reason more clearly about the limitations/capabilities at any time from a better informed position.

15 Likes

Thank you for taking the time to respond in such detail. I did not want to appear to be questioning the importance of those RFCs, more the order of implementation. I feel silly now for making you lose precious time answering :sweat_smile:

2 Likes

@nowfeelsafer Everything has value, and if you had this concern so will others who haven’t spoken about it. Even if we don’t, it is useful to give David the opportunity to explain and remind us how they work, and why. He doesn’t have to answer, so when he does it shows the value of your concern IMO.

Openness is one of the foundations of this project, one of the things that makes it so strong - in multiple ways. And I think that has to include us being open about our concerns.

7 Likes

I asked this question on the dev-forum yesterday and I got a reply from David.

Is there a max_group time which nodes are allowed in the same group while being allowed to sign messages for quorum?

Answer:

Yes, this is a key component here. A node starts at age 1. It stays in that group for 1 churn event. Then it is age 2 and stays in the next group for 2 churn events. This goes on all the way to an age of 255 (which no node will likely get to). So exponentially over time (defined by churn events) a node is moved from group to group. After age 10 this is every 1024 churn events etc. So at age 30 it is there for 1073741824 churn events. If we imagine a churn event every 30 minutes then this is over 60000 years. Hope this helps.
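
The figures in that answer can be checked with a few lines, taking 2^age churn events per stay (which matches the 1024 and 1073741824 figures quoted) and one churn event every 30 minutes. The function name and the 365.25-day year are my own choices.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def residence_years(age: int, minutes_per_churn: float = 30.0) -> float:
    """Years spent in one group at a given age, assuming 2**age churn events."""
    return (2 ** age) * minutes_per_churn / MINUTES_PER_YEAR


if __name__ == "__main__":
    for age in (10, 20, 30):
        print(f"age {age}: {2 ** age} churn events, "
              f"about {residence_years(age):.2f} years")
```

For age 30 this gives roughly 61,000 years, in line with the “over 60000 years” quoted above.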

So let’s see how this works out for new nodes:

  • Start your Vault
  • Get accepted in a group and get an address provided (let’s assume “AA01”) by the group with age 1.
  • You need to route data and sign messages but they don’t count for quorum. Your influence is 0.
  • If there’s a churn in a group you get a random new group assigned and need to do some POW to be accepted in the new group. If you do succeed in doing the POW you join the new group with age 2.
  • You need to route and provide data in the new group. But after 2 churn events you are assigned to a new group. You need to do a little POW again to get accepted. If you succeed you move to this new group with an age of 3.
  • After 4 churn events you are relocated again to a new group and get age 4.
  • After 8 churn events you are relocated again to get age 5.
  • After 16 churn events you are relocated again to get age 6.
  • After 32 churn events you are relocated again to get age 7.
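
The steps above follow a doubling pattern: a node of age n stays for 2^(n-1) churn events before moving on. A small sketch that reproduces the list and adds a running total (the function names are hypothetical, and note that the “age 10 is every 1024 churn events” figure in the quoted answer corresponds to 2^n instead, so the exact exponent should be treated as unsettled):

```python
from typing import List, Tuple


def churns_before_relocation(age: int) -> int:
    """Churn events spent in a group at this age, per the walkthrough above."""
    return 2 ** (age - 1)


def schedule(target_age: int) -> List[Tuple[int, int, int]]:
    """Rows of (age, churn events at that age, total churn events so far)."""
    rows = []
    total = 0
    for age in range(1, target_age):
        stay = churns_before_relocation(age)
        total += stay
        rows.append((age, stay, total))
    return rows


if __name__ == "__main__":
    for age, stay, total in schedule(7):
        print(f"age {age}: relocated after {stay} churn events "
              f"(total so far: {total}) -> age {age + 1}")
```

Under this reading a new node has seen 63 churn events in total by the time it reaches age 7.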
5 Likes