Foreword
Discussion of data flows and router buffer issues for home setups using “bring what you have”
@joshuef If you want to check the setting that I suggest might solve buffer exhaustion in home routers, skip to the end.
An examination of how home setups and “bring what you have” impact communications in the network.
Introduction
The Autonomi network is at heart a distributed data and file storage system using the spare storage capacity that people have on their computers. One of the goals has always been to have people use resources that they already own, which helps satisfy two other goals.
- an extremely distributed network, unseen anywhere in the world to date, where people around the world can participate not only by using the network but, importantly, by providing the nodes and storage that bring it about.
- a decentralised network that is owned by no one and by everyone on earth equally. Also one that is not controlled by any person, organisation, government or corporation, yet is collectively owned by everyone equally.
Low level communications, if not measured, can cause effects that appear to have other causes, because of such things as packet loss and transfer delays. These may appear in metrics as nodes not being responsive or as the computer the node runs on being overloaded, for example CPU usage that rises due to errors but looks like an overloaded computer.
This summary considers the impacts that using “bring what you have” will have on the quality of low level communications over the network and on its stability. The compute/storage provided is assumed to be a typical computer, anything from an SBC to a desktop, since their comms will be similar. The examination of communications will look into such things as
- What elements people are bringing along in terms of network setup, switches, and routers
- How data moves through the user’s network
- A touch on how data will flow through the internet (not done, since this discussion is big and time is constrained. It is to do with packet errors causing whole records to be resent)
- How UDP affects data flows, and the effects of data sizing
Discussion
Background considerations
Autonomi is relying on home users being able to use the hardware they have now, not hardware they might have in the future. So while the network software has to be written with the future in mind, it still needs to work in a satisfactory fashion even when outages of large sections of the network happen.
Outages could include or be caused by, but are not limited to, things like earthquakes, large power outages, fibre cables being cut or damaged (anchors, exit point damage, acts of war, maintenance, and so on), political actions, border router bugs/misconfiguration, and solar events.
For the network to survive it needs to be stable, not lose data, “feel safe/secure to users”, not be so slow that people stop using it, not cost an excessive amount, and give people easy ways to buy and use tokens.
This summary of low level communications is not concerned with most of the requirements for a network that will be adopted by the world. It is concerned with low level communications, which has consequences for the stability of the network and affects communications speed. Speed of record flow has a number of components, and low level communications is just one of the factors affecting record transfer speeds. This examination does not consider the processing time within the user’s nodes or remote nodes, but does consider the communications.
Network elements
Home networks started off with the LAN at 10Mbps, then 100Mbps, and now 1Gbps for most home LANs. Slower speeds are rare except for WiFi connections in the home. WiFi, being slower and closer in speed to the household’s internet connection, may actually benefit the household’s node stability because of the effects that will be outlined later.
This examination covers the typical household situation of computers connected to one or more switches and an ISP supplied router, with some notes on WiFi and mobile devices. It is not meant to be an exhaustive discussion but to bring understanding of the underlying communications, with a view to good decisions being made for the parameters affecting communications in Autonomi.
The following descriptions are not meant to be detailed; they are meant to provide an overview of each device’s purpose.
Hubs (L1 device)
These basically just connect the network twisted pairs together electrically without any buffering or flow control. They are used in LANs. A hub cannot control flow so that packets do not overlap, and all computers receive all packets sent by the other computers connected to the hub.
Hopefully no one is using these devices, because switches provide much better connectivity and are inexpensive enough to make hubs inappropriate.
Switches (L2 device)
These allow multiple computers to be connected together without packets clashing causing errors. Used in LANs (Local Area Networks)
The switch has memory buffers, uses the MAC address to do port to port switching. The advantage of using the MAC address is that packets are only sent to the device that is the packet’s destination. For switches used for the typical home network the ports are all the same speed and the memory buffer does not need to be too large.
Routers (L3 device)
These devices connect a LAN (Local Area Network) to WAN (Wide Area Network). When a packet is destined for a device not on the LAN the router will send the packet over the WAN connection (internet connection typically)
Most homes will only have one LAN, but some may have multiple LANs with one or more routers handling transfer of packets between LANs and the WAN(s)
The router is much slower (relatively) than a switch, since routing is a lot more expensive in processing and is often done by a microprocessor rather than a very fast switching chip. Routers all have memory buffers to queue up packets to/from the WAN and to hold the NAT table.
Routers are used throughout the internet and corporate networks, and these are much more powerful than any home router. The need for large memory buffers, though, really depends on the networks being connected and the speeds of the ports connecting to them. For instance a 1Gbps to 1Gbps connection does not need as large a buffer as a 10Gbps LAN to a 100Mbps WAN (or LAN).
ISP & home routers (L3 & L2)
These are routers with a switch in the same unit. The switch is handled by a specialised switch chip with its own small internal buffer memory. Any packets destined for the internet are buffered/queued in the router’s memory buffers and sent in turn. Packets received from the internet are stored in a buffer and, using its NAT table, the router either rejects each packet or passes it on to the destination device on the LAN.
Typically the home LAN is 1Gbps. And the WAN (internet connection) is 40Mbps or slower.
The average upload speed obtained from the speed test sites is a pure average. Because upload rates of 1Gbps and higher are possible in selected regions of the world, the average is skewed upwards, and even so it is a relatively slow 48Mbps.
For this discussion an uplink speed of 40Mbps will be used, and it must be remembered that most home internet connections will be at this speed or below.
TCP & UDP
TCP is not used for Autonomi at this time. The notable properties of this protocol are error detection and built in flow control, because each packet (block) is acknowledged as OK or NOK.
UDP is a protocol that is faster than TCP, but it has no acknowledgement, retransmission or flow control as part of its protocol. Its checksum lets a receiver discard a corrupted packet, but nothing in UDP reports or recovers the loss. Correct packet ordering is also not guaranteed through the routers or routes used.
www.cloudflare.com/en-gb/learning/ddos/glossary/user-datagram-protocol-udp/
QUIC
This is used by libp2p, the networking library which Autonomi uses.
There is flow control within QUIC, and the maximum amount of unacknowledged data that may be sent on a single stream (max_stream_data) is set to 10MB
// Ensure that one stream is not consuming the whole connection.
max_stream_data: 10_000_000,
Basic data flow
Data flows into and out of the device the node is running on. The node receives data in the form of messages and records. Messages are small and form part of the protocol that makes Autonomi work as a distributed data store. These can be operational messages and responses, or they can be requests for a record, either received by the node or sent by it. For this examination the operational messages can essentially be ignored since they are small and do not significantly cause communications issues. Records on the other hand are large in comparison and exercise the network infrastructure to a greater extent.
When a record is sent to the node it will arrive at the router, which will buffer the packets making up the record, check the destination of each packet and forward it to the device the node is running on. Because in the typical home network the WAN (internet) connection is significantly slower than the internal LAN, in normal operation each packet is forwarded without much delay, i.e. the buffer queue will remain small.
When a record is being sent from the node, the record will be broken down into network packets and forwarded through one or more switches to the router, which will buffer the packets while sending other packets from the buffer queue. These other packets may or may not be earlier packets from the same data block (record). The node will send up to “max_stream_data” before waiting for a handshake back from the receiving node. This is currently set to 10MB.
Since Autonomi is using UDP, the packets will be sent rapid fire from the device to the router as fast as the LAN will transfer them. Typically this is 1Gbps. The router will forward the packets across the WAN (internet) at 40Mbps or lower for the majority of home connections. If the router is not sending any other packets then the buffer is filling at approx 120MBytes per second and emptying at approx 5MBytes per second. Thus 1.2MB of data takes approx 0.01 seconds to reach the router and approx 0.24 seconds to be forwarded on to the internet.
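As a rough check of those figures, here is a minimal sketch of the arithmetic. The function and constant names are made up for illustration, and the 960Mbps LAN rate is simply the approx 120MBytes per second above expressed in bits. Nothing here comes from the Autonomi code.

```rust
/// Assumed usable LAN rate: approx 120MBytes per second, i.e. 960Mbps.
const LAN_MBPS: u64 = 960;
/// Uplink rate used throughout this discussion (approx 5MBytes per second).
const WAN_MBPS: u64 = 40;

/// Seconds needed to move `bytes` across a link running at
/// `megabits_per_second`, ignoring packet headers and other overheads.
fn transfer_seconds(bytes: u64, megabits_per_second: u64) -> f64 {
    let bits = bytes as f64 * 8.0;
    bits / (megabits_per_second as f64 * 1_000_000.0)
}
```

For example, `transfer_seconds(1_200_000, LAN_MBPS)` gives approx 0.01 seconds and `transfer_seconds(1_200_000, WAN_MBPS)` gives approx 0.24 seconds, so the router is holding most of that 1.2MB for most of a quarter of a second. A full 10MB would take 2 seconds to drain at 40Mbps.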
Now with QUIC there is the possibility of flow control on the data being uploaded to another node. Since the value is currently 10MB, up to 10MB could be sent in one go.
Single node considerations
For one node, records will be sent across the device’s network to the router and the above considerations on buffering will apply. One or multiple records can be requested at the “same” time. This can occur if a new node appears in the network close enough to records held by the operator’s node. When two records are requested together, the router’s buffer will have to handle approx twice the data.
Multiple node considerations
The requests for records will now be happening for multiple nodes in an independent manner. This may or may not result in multiple records being sent to the router at once for forwarding to other nodes. When there is little load on the network this will not happen often, unless there is an influx of many new nodes and more than one is close to some nodes on the operator’s device.
Multiple nodes on multiple devices on the one LAN
The switch may now have to buffer packets being sent to the router section, since the router is only one port off the switch (internal or not). A lot of packets being sent by multiple devices at full speed will require buffering. This is because the router has no way to flow control UDP packets, whereas TCP has flow control built into the protocol.
Review of basic data flow
Autonomi uses UDP as the low level protocol with QUIC on top. Because UDP has no flow control itself, the packets from a record request, up to “max_stream_data”, will be sent at full LAN speed through the switch to the router. If the switch is also switching packets to the router from other devices then buffer space in the switch will be used. The router will buffer these packets while it is forwarding them via the internet to the other nodes.
All is fine if there is no buffer filling up.
Buffer sizing vs record sizing
The basic data flow discussion showed that routers (& switches) have buffers to queue up packets to be sent. The buffers are essential, but in certain circumstances they can be exhausted, at which point the only remedy is for packets to be dropped.
For Autonomi one of the basic parameters for storage is the Max Record (chunk) size. For any file or record being uploaded, the maximum parcel of data in a record is the Max Record size. Many records may be smaller than this.
Thus when a file is requested there will be a minimum of 3 nodes supplying the various records (chunks) making up the file. This is due to SE (Self Encryption) having a minimum of 3 chunks, with each chunk having a maximum size of Max Record Size.
From this the average number of records in any one node on the network will be a function of the total data and Max Record Size. It is not an exact function because of the minimum of 3 chunks per file, so there will always be more records stored than total data stored divided by max chunk size.
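A minimal sketch of that relationship, under a deliberately simplified model of self encryption: a file becomes enough chunks to hold its bytes, but never fewer than 3. The function names and file sizes are hypothetical, and the real SE implementation has details (such as the data map) that this ignores.

```rust
/// Chunks needed for one file in this simplified model: enough chunks of at
/// most `max_record_bytes` to hold the file, but never fewer than 3.
fn chunks_for_file(file_bytes: u64, max_record_bytes: u64) -> u64 {
    // Ceiling division written out by hand.
    let by_size = (file_bytes + max_record_bytes - 1) / max_record_bytes;
    by_size.max(3)
}

/// Total chunks stored across the network for a set of file sizes.
fn total_chunks(file_sizes: &[u64], max_record_bytes: u64) -> u64 {
    file_sizes
        .iter()
        .map(|&size| chunks_for_file(size, max_record_bytes))
        .sum()
}
```

With three made-up files of 100KB, 3MB and 20MB, `total_chunks(&[100_000, 3_000_000, 20_000_000], 4_000_000)` gives 11 chunks, more than the 23.1MB of data divided by 4MB would suggest, while the same files at a 500KB max record size give 49 chunks spread over more nodes.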
The significance of this is that the number of nodes sending data when a given file is downloaded is higher (or the same for a small file) when the max record size is smaller. The follow on effect is that each home node’s router buffers will fill less than they would with larger max record sizes.
How can max record size cause issues with home networks?
Issues arise when more records are requested from one or multiple nodes on the LAN than the router has memory buffers for. When there is no more buffer space available then, being UDP, the following packets will be dropped.
What is the buffer size in a router?
This depends on the router model and purpose of a router. Here is an example of 2 routers with different purposes. Mikrotik routers were chosen because they actually provide the amount of RAM. This RAM is split up into send & receive buffers, indexes, NAT table, operational variables and any other requirements
(Semi) Infrastructure router
From the specifications the RAM is 64MB.
For high speed in and high speed out the buffering will always be minimal, and thus the RAM requirement is not as high as for other applications where there is a much greater mismatch between LAN & WAN speeds.
Specialised “home lab” type of router
Here the RAM is much higher to allow large buffers and NAT tables. It was specially made for many types of operations.
ISP supplied router
No specs, since there are hundreds of possible routers, with RAM sizes ranging from 16MB to 64MB
This is one reason why their NAT tables are relatively small and, connection wise, only support from 5 to maybe 40 or 50 nodes.
From the discussions above it can be seen that the limited buffer size on most ISP supplied routers will also be a limiting factor on the number of nodes that can be run.
What does this all mean
This discussion has deliberately not gone into too much depth or detail, but it provides enough of a brief to examine what is happening on home networks when the max record size is changed. It is not meant to give a comprehensive set of reasons for a recent collapse of the network, but to give developers some tools for working out this one area of sizing when planning for the current home internet setup, one that will persist for years to come in the average home.
Home ISP supplied routers are woeful in their internal specifications, in such things as
- routing throughput “pps” (packets per second) and data (Mbps)
- RAM 16MB to 64MB with most not at 64MB
- NAT table space
- send & receive buffers
- etc
For most home situations the overall data flow is receive, which requires little buffer space. Upload is rarely used in comparison, for little more than requesting download data/packets and occasionally a 5Mbps type of video “call”. With internet connection speeds much lower than LAN speeds there is no significant buffering for received packets, and if upload is above 8 Mbps then there is no buffering for those video chat sessions either.
This is why ISPs get away with supplying these low spec routers. For those with fibre the router typically has better specs. In these cases, though, with the WAN (internet) connection on par with LAN speeds, router buffering will be very minimal as well, so the ISPs get away with relatively low specs even for a premium internet connection.
In all cases, though, the users who have reported back show that their ISP supplied router had issues with the NAT table being too small to support enough connections for more than about 20 to 50 nodes.
From this discussion it can be seen that for ISP supplied routers the record size can also be a limiting factor on the number of nodes possible, and it may explain why the 4MB max chunk size left many who could previously run 5 nodes unable to reliably run even 1 node continuously, and mobile connections unable to run any at all.
So how can a good size for max record be determined?
It depends on the target makeup of the network desired.
If the target is a network with nodes mainly in professional locations such as data centres or large businesses with premium networking/internet, then the larger the max chunk size the better, since the switches and routers will have the RAM and the network infrastructure has WAN speeds comparable with any LAN speeds.
If the target is a network with nodes mainly in homes with an average ISP supplied router/switch, then a smaller max chunk size is needed, one that will not overflow the internal router buffers whenever more than a couple of chunks are being uploaded.
What size is suitable for maximising home nodes? From the test networks, 1/2MB worked well for the loads being placed on the network. Compared to the 4MB max chunk size, 1/8 of the buffer is needed for the same number of chunks, and coinciding chunk uploads through the one router are also less likely since a chunk is fully uploaded 8 times quicker. Together this represents as little as 1/64th of the buffer load on the home router
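The 1/64th figure is just the size ratio applied twice, once for the buffer each chunk needs and once for how long each chunk occupies the uplink. A small sketch of that reasoning, with a hypothetical function name and the rough assumption that the chance of two uploads coinciding scales linearly with upload time:

```rust
/// Relative buffer load of a smaller max chunk size compared to a larger one,
/// following the reasoning in the text rather than any measurement.
fn relative_buffer_load(small_chunk_bytes: u64, large_chunk_bytes: u64) -> f64 {
    let size_ratio = small_chunk_bytes as f64 / large_chunk_bytes as f64;
    // One factor for buffer per chunk, one for the time a chunk spends
    // uploading (and so the chance that another upload coincides with it).
    size_ratio * size_ratio
}
```

`relative_buffer_load(500_000, 4_000_000)` gives 0.015625, which is 1/64.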
Conclusions
Data flow through the home network to the internet (upload) relies heavily on the buffer in the router, because with UDP the router cannot regulate the flow and drops packets if the buffer space is exhausted.
Multiple nodes uploading chunks at approximately the same time will use more of that buffer space. The larger the chunk the more buffer space required.
A typical ISP router with 16MB to 64MB of memory will not allocate all of it to buffer space, since many tables and other buffers also have to live in the RAM. Most ISP routers do not have upmarket specs and likely only have 16MB or 32MB of RAM, so the buffer space could be as low as 6MB on a router with 16MB and 12MB on a router with 32MB. It’s pretty obvious that any network with chunk sizes above 2MB will max out many home routers once 2 chunks have to be uploaded at the same time. A better ISP router might manage 4 or even 6 chunks at the same time.
Thus it can be seen that 4MB max chunk sizing will exclude many home routers.
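To put rough numbers on that, here is a sketch of a very simple tail-drop model. It assumes all chunks arrive back to back at LAN speed (960Mbps, as in the earlier arithmetic), the router forwards at 40Mbps while they arrive, and anything that does not fit in the buffer is dropped. The function name is hypothetical, the 6MB and 12MB buffer sizes are the guesses from the conclusions above, and retransmissions and any pacing by the sender are ignored.

```rust
/// Bytes lost by a tail-drop queue when `chunks` chunks of `chunk_bytes`
/// arrive together at LAN speed and drain at WAN speed.
fn dropped_bytes(
    buffer_bytes: u64,
    chunks: u64,
    chunk_bytes: u64,
    lan_mbps: u64,
    wan_mbps: u64,
) -> u64 {
    if wan_mbps >= lan_mbps {
        return 0; // the uplink keeps up, so nothing queues
    }
    let burst = chunks * chunk_bytes;
    // While the burst arrives the router is also forwarding, so only the
    // (lan - wan) / lan share of the burst needs queue space.
    let queue_needed = burst * (lan_mbps - wan_mbps) / lan_mbps;
    queue_needed.saturating_sub(buffer_bytes)
}
```

With a 6MB buffer, `dropped_bytes(6_000_000, 2, 4_000_000, 960, 40)` loses about 1.67MB of the two 4MB chunks, while eight 500KB chunks (`dropped_bytes(6_000_000, 8, 500_000, 960, 40)`) lose nothing. With a 12MB buffer three 4MB chunks fit, but a fourth pushes about 3.3MB over the edge.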
Addendum and solution if I read the QUIC code correctly
From the discussions it can be noted that the issue is flow control of the record data blocks being sent over UDP. If there was a way to tell nodes to only send the record in small amounts then the buffer issue reduces to a manageable size.
Solution?
As mentioned in the write up, there is a QUIC constant that controls the maximum amount of data.
// Ensure that one stream is not consuming the whole connection.
max_stream_data: 10_000_000,
Just set it to 500KB, or even 250KB for potato routers, and a 4MB or even 8MB max chunk size should be fine, not causing routers to exhaust their buffers during higher activity like replication & churning.
Leaving it at the default of 10MB is the problem
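A rough sketch of why the smaller value should help, if I read the flow control correctly. It assumes each stream can burst at most one window (or the whole record if that is smaller) at LAN speed, and that further data is only released as the far end acknowledges it, which cannot happen faster than the uplink drains. The function name is hypothetical, the parameter is only named after the libp2p setting, and QUIC congestion control and pacing are ignored, so treat the numbers as an upper bound on the burst rather than a prediction.

```rust
/// Peak router queue in bytes when `streams` records are uploaded at once
/// and each stream is limited to `max_stream_data` bytes in flight.
fn peak_queue_bytes(
    streams: u64,
    record_bytes: u64,
    max_stream_data: u64,
    lan_mbps: u64,
    wan_mbps: u64,
) -> u64 {
    if wan_mbps >= lan_mbps {
        return 0; // the uplink keeps up, so nothing queues
    }
    // A stream cannot burst more than its window, or more than the record.
    let burst_per_stream = record_bytes.min(max_stream_data);
    let burst = streams * burst_per_stream;
    // The router forwards while the burst arrives, so only part of it queues.
    burst * (lan_mbps - wan_mbps) / lan_mbps
}
```

For 4 records of 4MB uploading together over a 960Mbps LAN and 40Mbps uplink, the 10MB default gives `peak_queue_bytes(4, 4_000_000, 10_000_000, 960, 40)` of about 15.3MB, well past a 6MB buffer. At 500KB the same call gives about 1.9MB, and at 250KB just under 1MB.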