Vault routing burden

The chances of vaults on the same machine being on a single routing path are negligible surely?

4 Likes

Of course. Admittedly that was a small aside in a much bigger topic.

Multiple smaller vaults per machine will open the way for more people to farm (who otherwise couldn't due to routing burden).

Let's double the number of vaults in the OP scenario to two million. This almost halves the routing burden per vault, from about 0.7 MB/s to about 0.36 MB/s (not quite half, because hop count increases slightly as well). People who could afford the 0.7 MB/s before for one big vault can now (most likely) run two vaults half the size with a combined routing burden of 0.72 MB/s. The benefit is that people with connections that can only afford between 0.36 MB/s and 0.7 MB/s for the routing burden can now run a vault as well, while they couldn't run any before.
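A rough sketch of that scaling, for anyone who wants to play with the numbers. The model is an assumption, not something taken from the OP: total traffic is held fixed, so each vault's share falls as 1/vaults, and hop count is assumed to grow with the logarithm of the network size. The function name `routing_burden` is made up for illustration. With these assumptions the result lands close to the roughly 0.36 MB/s quoted above.

```python
import math


def routing_burden(base_burden_mb_s, base_vaults, new_vaults):
    """Scale a per-vault routing burden to a different network size.

    Assumptions (not from the thread): total traffic is constant, so the
    per-vault share shrinks as 1/vaults, and hop count grows with
    log(vaults), which pushes the burden back up slightly.
    """
    share = base_vaults / new_vaults
    hop_growth = math.log(new_vaults) / math.log(base_vaults)
    return base_burden_mb_s * share * hop_growth


per_vault = routing_burden(0.7, 1_000_000, 2_000_000)
print(f"per vault: {per_vault:.2f} MB/s")
print(f"two vaults on one machine: {2 * per_vault:.2f} MB/s")
```

Changing `new_vaults` shows how quickly the per-vault burden drops as the network grows, which is the point of splitting big vaults into smaller ones.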

Mitigating the extra earning rate above a certain vault size probably needs an approximate measure of the average or median vault size of the network.

2 Likes

A very simple cap would be:

personal farming rate = (general farming rate * your vault rank) / (average network-wide vault rank * 1.1)

This mitigates any benefits from having a vault size over 110% of the network average. Above that 110%, the odds of getting a farm attempt lessen proportionally to the amount of additional chunks your vault has. In other words, you don't earn more by having more chunks.
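One way to read that cap, sketched in Python. This is an interpretation of the description above, not a spec: it assumes farm attempts normally arrive in proportion to vault size, and that past 110% of the network average the odds per chunk are scaled down so total earnings stay flat. The names `per_chunk_scale` and `effective_size` are hypothetical.

```python
def per_chunk_scale(vault_size, network_average, cap_factor=1.1):
    """Factor applied to the odds of a farm attempt per chunk.

    Assumption: 1.0 up to cap_factor * network_average, then shrinking
    in proportion to how far the vault exceeds that cap.
    """
    cap = network_average * cap_factor
    return min(1.0, cap / vault_size)


def effective_size(vault_size, network_average, cap_factor=1.1):
    """Vault size that actually counts toward earnings under the cap."""
    return vault_size * per_chunk_scale(vault_size, network_average, cap_factor)


average = 100  # hypothetical network-wide average vault size
for size in (50, 110, 220, 440):
    print(size, round(effective_size(size, average), 2))
```

Under this reading, a vault four times the average counts no more than one sitting right at the 110% mark.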

1 Like

Comprehending Kademlia Routing - A Theoretical Framework for the Hop Count Distribution

1 Like

2^20 = 1048576. If you exclude half of the network with every hop, you need 20 hops for a million nodes.
As far as I understand it, there are 32 nodes in a group, and if you don't hold the data, the request gets passed on to the node that is closest in XOR distance to the data being requested.
Closeness depends on perspective: although you share closeness with Alice and Alice with Bob, you don't share closeness with Bob.
Excluding the requester, there seem to be 30 available pathways for every hop.
Which gives me:
30^4 = 810000
30^5 = 24300000
An average of about 4 hops in a network of a million nodes.
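The same back-of-envelope estimate as a few lines of Python. It assumes, as above, that every hop narrows the search by a fixed factor (2 for halving, 30 for the pathways-per-hop guess), so the hop count is just a logarithm. `estimated_hops` is a made-up helper name, and the model is a simplification, not how the routing code itself works.

```python
import math


def estimated_hops(network_size, fanout):
    """Hops needed if each hop narrows the search by a factor of `fanout`."""
    return math.log(network_size) / math.log(fanout)


nodes = 1_000_000
print(f"halving each hop:    {estimated_hops(nodes, 2):.1f} hops")
print(f"30 pathways per hop: {estimated_hops(nodes, 30):.1f} hops")
```

With a fan-out of 30, a million nodes sits just above 30^4, so the estimate comes out slightly over 4.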

1 Like

To circumvent the routing overhead of resending the data many times, a routing node should be introduced to the SAFE network.

It would be a persona that agrees to connect two IPs together with minimal latency (in exchange for safecoin?).

The normal SAFE network routing would negotiate the connection; the nodes would ask the network to connect a router node to their IP.

This way the data is only sent and received twice, making video chat etc. possible.

I don't think there are bad security implications. The data is encrypted, so routing nodes would only know the two IPs and the amount of information exchanged.

This is true. We did a trial with the NHS in Scotland with self-encryption for medical images. It outperformed Microsoft, with 48% savings per disk (not per network, where we would be stronger) versus Microsoft at 8%. It was also faster on SSDs, where we could get self_encrypt throughput of 1Gb/s on 300Mb/s disks (due to parallel encryption of chunks). So encrypted/obfuscated was faster than plain files in that instance.

We also tested movies on the mini network and it was faster to seek to any part of the movie than it was locally (vlc), or certainly seemed/felt that way. Seeking to a part of a Gb movie at the end was very quick indeed. We will see in real life soon though :wink:

17 Likes

There's some new info regarding relay nodes and Client Managers in this topic. I'll quote David below:

You connect to different nodes each time with encrypted streams (to known nodes when you can) after you get on the network. So your IP address is scrubbed, you connect via a random port to a random node(s). I don't see it as a design flaw but as the best that can be done.

Ideally if a group can create their own private key (group key) then we may be able to do more (look at SQRL type key derivation to see how this may do). Then you connect to a relay node(s) and encrypt traffic through them to client managers. So it can improve but already seems significantly better than any other system I can find in securely getting on a p2p network.

And here's some more info:

client → relay node(s) → Client Managers (for Put)
client → relay node(s) → NameManagers (for Post/Get)

So the relay is almost dumb and does not need to understand what is happening; it just passes on "stuff" to an address. What is readable is the routing info minus IP and data (which on the network is encrypted), as well as signed requests to alter (Post) or plain Get requests.

The part that's new is the use of relay node(s), which are used to connect you to your close group of 32 nodes. This means that even the other 31 nodes won't be able to see your ip:port etc.

5 Likes

I propose to organize some community stress tests once we have the global (test) network up and running. We'll pick a relatively long HD video, upload it, and then pick a timeframe in which we'll try to get as many people as possible to watch that video in full. We'll ask everyone to confirm in one thread that they're participating in the test, so we'll have an estimate of the number of participants involved. Other stats like vault address density will let us estimate the number of online vaults in the network, and provided that the network is still relatively small, we should notice a measurable increase in bandwidth consumption on our vaults during that time frame.

This could provide very valuable input for the design of SAFE's economic algorithms. If we find that the network is likely going to be very short on bandwidth, we may want to design algorithms in such a way that they dissuade uploads and/or downloads of huge files somehow.

15 Likes