Router connection issues and ways around them without increasing node size to 1 bazillion gigabytes per node (discussion)

Okay … I just wrote this in Discord but didn’t think about tagging you @joshuef, and Jim said the forum might be a better place anyway …

… With the router problems … since it’s already possible to use relay nodes (the home-networking mode does it anyway), maybe a solution for reducing the stress on home routers would be:

  1. The first node gets started as a regular node (one that accepts relay connections from nodes in your local network)
  2. The other nodes get started with parameters to use the first node as a gateway to the network (?). The mechanisms should already exist, I think … And maybe starting one relay node for every 50 or 100 additional nodes (automatically in the launcher?) would let people with high bandwidth run hundreds or thousands of nodes without issues … while not increasing the node size so much that it becomes impossible for people with little bandwidth available to join …? (edited)
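To make the launcher idea concrete, here is a minimal sketch in Python. The binary name and both flags are entirely hypothetical (nothing like this exists in the launcher today); it only shows the shape of “one relay per 50 nodes”:

```python
import subprocess

RELAY_RATIO = 50  # hypothetical ratio: one local relay node per 50 worker nodes

def launch_fleet(total_nodes: int):
    """Start nodes in groups, each group sharing one local relay/gateway node."""
    processes = []
    for group_start in range(0, total_nodes, RELAY_RATIO):
        # The first node of each group is started as a regular node that also
        # accepts relay traffic from the rest of the group on the LAN.
        relay = subprocess.Popen(["my-node-binary", "--act-as-relay"])  # hypothetical binary/flag
        processes.append(relay)
        relay_addr = "192.168.1.10:12000"  # in practice, discovered from the relay node itself
        group_size = min(RELAY_RATIO, total_nodes - group_start)
        for _ in range(group_size - 1):
            # The remaining nodes are told to reach the network via that relay.
            processes.append(subprocess.Popen(
                ["my-node-binary", "--via-relay", relay_addr]))  # hypothetical flag
    return processes

launch_fleet(200)  # e.g. 200 nodes -> 4 relays + 196 relayed nodes
```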

… the trouble with routers is just the number of entries in the connection table, which gets into the hundreds of thousands or even millions … while the network altogether is around 50k nodes … and that’s the maximum number of endpoints one should need to be connected to …
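Rough back-of-envelope for where those table entries come from (the per-node peer count and node count are assumptions, not measured figures):

```python
# Assumed figures, just to show the multiplication that hits the router.
nodes_behind_router = 500   # a larger home setup
peers_per_node = 300        # assumed peer connections kept open per node

entries = nodes_behind_router * peers_per_node
print(f"~{entries:,} connection-table entries")   # ~150,000

# The whole network is only ~50k nodes, so most of these entries point at the
# same remote endpoints -- but the router can't share them between local nodes.
```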

@peca hinted that the functionality actually isn’t there yet because the relays just coordinate hole-punching events … but still, maybe going the relay-node route might be more beneficial than other means of trying to get around router limitations?

5 Likes

I’m not knowledgeable about networking, but might this create a problem where the close-group of the relaying node ends up getting flooded with too much traffic to relay?

If not, great. If so, it could perhaps still be done, but not as extreme as 100:1. Even 10:1 or 5:1 would have a big impact in boosting the number of nodes home routers can support.

1 Like

That is correct.

Also, having one of your nodes act as a relay would require the router to do hairpin routing, which most routers don’t do by default.

The other issue is that the number-of-connections problem is not what a relay node is there to solve. Why?

  • the nodes don’t use the relay to connect to peers
  • the relay is there to let incoming unsolicited connections hole punch through your router (i.e. make a connection)
    • that is, the client contacts the relay, and since your node already has a connection set up to the relay, the relay can tell your node that a client at xyz wants to talk to it (and tell the client the same), which allows a direct connection to be established between the node and the client
  • the relay does not save on the number of connections needed through the router
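A toy sketch of that exchange (plain Python, not the real libp2p relay/hole-punching code, just the shape of the message flow described above):

```python
# Toy simulation of the coordination role a relay plays.
relay_registrations = {}   # node_id -> node's address as seen by the relay

def node_registers(node_id, node_public_addr):
    # 1. The node keeps a long-lived outbound connection to the relay, so the
    #    relay learns (and can later hand out) the node's public address/port.
    relay_registrations[node_id] = node_public_addr

def client_requests(node_id, client_public_addr):
    # 2. A client that can't reach the node directly asks the relay for it.
    # 3. The relay passes each side's observed address to the other.
    node_addr = relay_registrations[node_id]
    print(f"relay -> node:   a client at {client_public_addr} wants to talk")
    print(f"relay -> client: the node is reachable at {node_addr}")
    # 4. Both sides now dial each other at roughly the same time; each router
    #    sees an outgoing packet first and opens a pinhole, after which the
    #    connection is direct and the relay carries none of the node's traffic.
    return node_addr, client_public_addr

node_registers("node-1", "203.0.113.5:43210")
client_requests("node-1", "198.51.100.7:50000")
```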
2 Likes

Really, we have reasonable connection counts with 2GB nodes and are getting a grip on what is and isn’t possible. For many it was simply a case of too-high expectations about how many nodes they can and should be able to run.

Once people lower their expectations a little then they have a much better and smoother experience.

Increasing the node size introduces some negatives, and at some point those outweigh any further benefit from increasing the size.

@riddim brings up an excellent point about the downside for people with less than what many of us are used to: the ability to run nodes at all with the storage capacity and bandwidth they have available.

  • how much data will be required to go back and forth across their internet connection. Some places in most countries, and some whole countries, are still on older connections with upload speeds below 5 Mbps.
  • at these lower speeds they can easily run a single node (or a few) at 2GB. With 5GB nodes they should still manage to run a node, but at 20GB, no way.
  • the “inrush” of chunks to fill a new node with the chunks it has become responsible for:
    • at 2GB, one expects on the order of 1GB per node for a healthy network with a good amount of data
    • at 5GB, this is 2.5GB
    • at 20GB, it’s 10GB
    • the point @riddim makes is how long it takes for a new node to get all the chunks it is responsible for (see the back-of-envelope numbers after this list). We must remember that nodes are not necessarily started once and left running for years; in less privileged places they may get a few days at a time at most, or even less than a day. If it takes too long, it isn’t reasonable to expect the person to run a node at all. On a 4 Mbps connection it will take over half an hour just to receive the chunks one 2GB node is responsible for, and that’s with their internet running flat out. @JimCollinson, a large portion of the world’s population in 2024 is still on sub-5 Mbps connections. 2GB is already a large node size for these people, whom we want to attract as node operators. 5GB would exclude even those who can only run nodes for a day or less. At some point we draw the line, but let’s not let 20GB draw that line at cutting off hundreds of millions of people from operating even one node.
  • yes, it will improve over time, and @riddim suggested that doubling the node size every so often via updates could be Autonomi’s “halving”. He suggests starting off at 2 or 4GB and doubling after some years.
  • And of course decentralisation is harmed more and more as the node size gets larger: fewer nodes overall, and fewer still because more and more people are unable to run even one node. We cannot see this during the beta since most testers are in more privileged regions.
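To put numbers on the inrush point above, a quick back-of-envelope calculation (fill levels are the estimates from the list; the 4 Mbps link is the example speed mentioned, and protocol overhead is ignored):

```python
# Back-of-envelope fill times: how long a node on a slow link needs just to
# receive the chunks it becomes responsible for.
fill_per_node_gb = {2: 1.0, 5: 2.5, 20: 10.0}   # node size (GB) -> expected fill (GB)

def fill_time_hours(fill_gb: float, link_mbps: float) -> float:
    return fill_gb * 8 * 1000 / link_mbps / 3600   # GB -> Gb -> Mb -> seconds -> hours

for size, fill in fill_per_node_gb.items():
    t = fill_time_hours(fill, 4.0)                 # a 4 Mbps connection, flat out
    print(f"{size:>2} GB node: ~{fill} GB inrush -> ~{t:.1f} h on 4 Mbps")
# 2 GB node: ~0.6 h (over half an hour); 5 GB: ~1.4 h; 20 GB: ~5.6 h
```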
4 Likes

it would if it were baked into the system a bit more … if it didn’t just relay your queries as they are, but sent the messages to its usual peers “on behalf of another node” … which probably opens the question of how the close group knows where to send those packets, because our first node simply doesn’t have the right network address …

… that hole might become a deeper one than I hoped for …

1 Like

I expect decentralisation will be harmed by nodes being either too small or too large, so it’ll be a balance. I don’t think it’s a simple case of bigger nodes = less decentralisation.

If node size is too small, router issues may leave huge amounts of unutilised space on home devices, forcing much of the network to centralise in data centres & among the few who buy high-end routers.

If node size is too big, as you’ve said, a portion of home users will be excluded from running nodes due to it overwhelming their internet connections, leading to centralisation in high-speed internet areas.

Balancing these should optimise decentralisation, but it may be that the optimal point for decentralisation does exclude those with very slow connections. Thankfully Internet speeds are getting faster rapidly, and will likely continue to do so, but I’m not sure home routers will be optimised for handling sufficiently more connections as storage becomes cheaper / demand for network resources grows.

I hope that software optimisations are possible to reduce router strain, so smaller nodes may become less of an issue.

If the number of connections is still a constraint for a significant portion of home users, in time there is the potential for making Autonomi optimised router devices for home use, which could have a sufficiently beefy processor to handle many connections, and also the ability to add RAM / Storage to run nodes, and possibly specialised processing capacity for home-based AI functions that are envisaged.

3 Likes

No, it would not. You’d then be making the relay do silly things. The relay does not touch the connections to peers.

And if you changed the relay so it did make those connections, you would still end up with all those connections: there has to be a way to separate out the connections to the peers, and there will be virtually no overlap (except the statistically rare one) of peers amongst 100 nodes. So the relay would be connecting to 100 x 300 peers with one or more connections each. Doing that would only bog down your relay process, turning it into a quasi-router with the job of handling all those connections and passing them on to the router to send out to the 100 x 300 peers.

The relay is there to help establish connections and then let the nodes do their job of talking to the clients. The peer connections do not require a relay, and involving one would only make everything slower.

2 Likes

And in my other discussions on node size I say nodes need to be as small as practical, so as not to hinder the operation of the network.

As you say, it’s a balance and we need to find it, which is what the devs are doing right now. But just jumping to an arbitrarily high number, as some want to do, will only cause even bigger problems: reduced decentralisation, smaller operators unable to take part, and the goal of maximising participation lost when whole poorer regions and countries are mostly excluded from running nodes.

So it’s back to, as you say, a balance, and that is what this topic is exploring.

It seems from Jim’s hmmm-ing and arr-ing on the stage that they want to try 20GB node sizes. In my opinion this is too large for the current environment worldwide; some will say it’s too small, and a much larger group will be saying “I cannot participate as a node operator any more”. The beta will not show this up because most testers are in regions with better internet, with many more who are technically inclined and willing to try.

As @riddim pointed out in the Discord, some people (himself included) will find it hard due to the bandwidth needed and the long time a node will take to start while loading the initial chunks it became responsible for.

I hope that if they do try 20GB they will consider trying smaller sizes again, like 4GB.

And once the network gains popularity, the millions, potentially billion-plus, PCs with spare space will mean that large node sizes are not needed.

It’s also a matter of people scaling back their expectations and considering the parts of the world that are not as rich as they are: places where money is scarce and the connection to the world is a phone, a small community satellite dish, or some other link. We want to maximise the number of node operators to spread the benefits of Autonomi as far as possible.

EDIT: It’s my opinion that having node operators in a geographic region will also spread usage, as clients or even new node operators, through those communities as the node operators talk about it to others.

3 Likes

Oh dear Jim is typing away, I am in trouble :face_with_peeking_eye:

There are many facets to this (and I’m not saying I have the answer yet either), but there are constraints on the accessibility and use cases of the network due to its capacity, and thus the price of storage too.

Participation needs to be considered from both the supply side (as we are interrogating here) and the consumption/demand side too.

5 Likes

Definitely and this is what the balance has to achieve. Can’t say everything all at once :wink:

You are definitely right though, and that’s why I say as small as practical; if 2GB is not practical then we have to go larger. But I cringe when people suggest 100GB, or 1TB, some 10TB, and some as large as they want (variable size).

EDIT: personally I would have started at 4GB or even 5GB and decided things from there. And yes, 20GB would not affect me and would make my experience easier, but it may not be that way for everyone.

2 Likes

I think the 2GB was just for benchmarking really, always with the expectation that we’d move on from there, and learn as we go.

I do find the “halving”/doubling event an interesting idea though!

6 Likes

It’s going to be necessary as time goes on and internet connections and disk drives keep growing. So doing it every so often is not only wise but, in my opinion, essential to keep the balance of the various factors in a good place.

And the idea of calling it Autonomi’s “doubling” will only pique interest, after a few years, in those who hadn’t heard of it. And doubling, that must mean better; double is always better, isn’t it?

3 Likes

‘Doubling’ sounds good, but perhaps less drastic increases more frequently might be more practical / less disruptive?

Who knows, maybe multiple node sizes on the network will be feasible with future developments, which would be ideal.
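For what it’s worth, the two cadences can target the same long-run growth; a doubling every four years works out to roughly a 19% bump each year, so the difference is mainly how big each individual jump feels (illustrative numbers only, not a proposal):

```python
# Same long-run growth, different step sizes (illustrative only).
start_gb, years = 4, 12

# Option A: double every 4 years.
sizes_a = [start_gb * 2 ** (y // 4) for y in range(years + 1)]

# Option B: smaller annual bumps at the equivalent compound rate (2 ** (1/4) ~ +19%/yr).
rate = 2 ** (1 / 4)
sizes_b = [round(start_gb * rate ** y, 1) for y in range(years + 1)]

print("doubling every 4y:", sizes_a)   # 4, 4, 4, 4, 8, 8, 8, 8, 16, ...
print("+19% every year:  ", sizes_b)   # 4.0, 4.8, 5.7, 6.7, 8.0, 9.5, ...
```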

1 Like

I haven’t read the whole thread, but it seems like two sides of the same coin.

Either we want it to be hard to run many nodes behind a router, or we want it to be easy. The former leads to more decentralisation, the latter to more storage. They are opposing forces.

Of course, if the network becomes wildly popular, routers will get upgrades that make it easier to run more nodes again. Perhaps that means it is self-defeating?

Feels like 2GB is too small and 100GB is too large. Maybe we just need a sensible number in the middle that can be tuned in the future without killing the network?

3 Likes