Routers for advanced setups

Happy to share my few weeks’ worth of experience. I would just caution against investing too much at once and expecting a smooth ride. There’s a lot of risk. My ISP allows small incremental upgrades within weeks, and that has saved me a lot of money.

The UDM PRO MAX is better than the UDM PRO (importantly, it supports a larger maximum NAT table size and higher throughput), but it is not worth the money for applications such as autonomi. Where good connectivity is important, you need to co-locate or get true enterprise-grade internet, which is expensive, or use cloud/VPS and be ready to work around other limitations there.

I have learned a lot about networking in the last few weeks, and as a result I am moving my VPS applications on-prem (leaving some in the cloud for resiliency) and moving autonomi off into the cloud. My conclusion is that it’s not worth degrading an otherwise great home internet connection, and an otherwise great gateway with excellent management capability and GUI, with p2p traffic (unless it’s limited to 100-200 nodes or so if you have symmetric broadband, or just 5-10 nodes if you have cable!).

A summary of my regrets people can learn from :slight_smile: :

  • My UDM PRO MAX purchase, because it’s still just a prosumer-grade product. I wish I could justify spending multiple thousands of dollars on proper hardware, but I fear that I would still be running into issues using lower-cost but carrier-grade equipment. Using servers as routers did not deliver either, after all the trouble to get port forwarding and hairpin NAT configured correctly. I have tested nftables/firewalld and VyOS, and pfSense only somewhat.
  • Upgrading from 500/500 to 1000/1000, because the higher the rate, the harder it gets to sustain the maximum. This will differ from provider to provider, though, so you might get lucky.
  • Maxing out some of my servers at home, because RAM and storage are not the limiting factors (yet) and there is no good way to get around a bad ISP even if you get your router working properly.
  • Upgrading UPSes, for the same reasons. The cloud is quite cheap considering that there are no power outages there.
  • Renting a larger VPS because of hitting traffic limits (per IP, I suspect) and not being able to have those limits raised for VPSes. (Dedicated would supposedly be OK, but even more expensive.)

With the UDM MAX I got the best earnings with SmartQueues on (it greatly helps stability even though “it is not recommended beyond 300/300”), UPnP enabled on the router, and, surprisingly, safenodes started with the --upnp flag.

So there were issues with both the ISP and the gateway routers. While I would be ready to explore pfSense or VyOS routing options further, I am going to take a break because of my concerns about my ISP’s performance.
Connecting a huge server with a few TB RAM that can run 1000 nodes or so directly to the ISP is the best way to troubleshoot (taking routing and NAT out of the equation). That way you can argue with them. You will then also have to run your own tests to measure packet loss between a server in the cloud (or better, a server at a friend’s home in a different location but connected by the same ISP), and your server at home while running 1000 nodes. I have not done that testing.

I suspect that packet loss (or latency or latency peaks maybe) is reducing the odds of receiving and responding to paid PUT opportunities: I have seen nodes that had sufficient peers and appeared healthy, but that earned substantially less than nodes on a “crappy” VPS.

A feature request to autonomi: A way to somehow measure and report packet loss and latency between peers and my safenode, to determine connectivity health more easily.


@Josh I know you’re having issues with your ISP. Here we have a broadband forum that has been running since the first broadband internet was being installed. It’s a huge forum with a huge amount of knowledge on every ISP in Australia. Because of that, the ISPs have had to offer slightly better than average tech support, since fact checking is a few mouse clicks away. But yeah, it’s still shit for the most part. Occasionally I’ll get an eager beaver who actually realises they are talking to a knowledgeable person and works with me. But most have to run through their script of reset blah blah before even listening to the question. The reset and power off part is in their hold music/talkies.

You have to let the ISP play their game of following scripts and bite your tongue at times. Just once, while following the script, they found a faulty connector on the street pole that was causing intermittent problems. If they are using multi-wavelength fibre, then connection issues can affect one or more wavelengths and not others. But at least if they can tick off their script, then they have to fix their equipment, and it means they can justify the cost to the bean counters, the almighty dollar/shareholders.

Anyhow, I have been doing a lot of local network testing and decided last night to spin up more nodes on top of the 30 I have been running behind my router. So I have gotten to 85 nodes now and YouTube is still butter smooth. I am stoked to say the least; it feels so good after the old router.


Thank you for such a detailed answer!

My ISP and how they handle things is completely unknown to me. Frontier is brand new in my area, but my area is also expanding very quickly, so hopefully they didn’t cheap out on the hardware. But going from coax to fiber is going to be a huge improvement in my upload regardless of whether I can run the number of nodes the speed claims it can run or not.

I will also be learning a bit about networking in the coming weeks, as my previously gig-only network is going to need a bit of a restructure to make use of the hardware that is already 2.5G or 10G capable.

I do have some questions about your “regrets” list if I may.

Do you regret the UDM PRO MAX because it isn’t the enterprise-level gear you thought it would/should be? Or because the product itself isn’t all that great? I was heavily leaning towards this product for its claimed ability to handle IDS/IPS at 5 gig (which I now have the option to get, and I want to future-proof myself. Buy once, cry once). I really did like the IDS/IPS ability and reports on my current hardware, but it was simply too slow to handle even my 500 down.

I have plenty of non-mission-critical loads I can move from VPS to on-prem and replace the VPS capacity with nodes. 2 gig may be overkill, but it’s at the same price point as my current coax connection. With your speed upgrade regret, are you saying you didn’t see much of a difference? Or just that because of what you were trying to do, you don’t feel you were able to make use of the expanded capacity?

My “regrets” are with the application of trying to run 1000+ nodes from home in mind. The UDM PRO MAX will not work, and as for the ISP, anything over 500/500 may disappoint. There may be ways to utilize the bandwidth with a better router or no router, a correct router setup, and an ISP that works with you, but it could be a lot of work. I hope someone will post a complete working setup, because I have used up my time budget trying.
The UDM is not enterprise-grade and I knew that beforehand. The UDM MAX supports maybe double the autonomi traffic that the UDM PRO supports, justifying the higher price somewhat, but I needed far more than just a factor of 2 to utilize my 1 Gbps service.
If your ISP is new in the area you have a good chance of being successful, because they could be using new hardware, and knowledgeable people could still be around to troubleshoot their new network. And the fewer customers they have, the better.

In any case, for both router and ISP it’s important to realize that running 1, 2 or 5 Gbps for general use is very different from saturating that bandwidth with hundreds of thousands or millions of connections. The UDM MAX, for example, is supposed to support 5 Gbps, but that’s only for certain kinds of traffic, I suppose. The same goes for the ISP: I ran a speed test, easily hitting 1 Gbps in both directions, but that says little about this kind of load.

Generally, however, the UDM PRO is great, and the MAX is somewhat faster with plenty of memory. I am keeping it as a fail-over in case my older UDM PRO dies. Before this, I was considering the backup power unit, but the price of it just doesn’t make sense.
Coax and Fiber ISPs seem to price their services based on download speed only. As a result, coax seems ridiculously over-priced because the upload speed is typically just 1/10th.
After the beta, my current setup will happily support a few hundred nodes without disrupting my other services. I will probably be okay to downgrade my broadband service to 500/500 or even half that.

Thank you autonomi for “justifying” :wink: my impulse purchases and upgrades, and hopefully I will cover some of the cost after launch :slight_smile: .


Some more info about the UDM PRO MAX conntrack entry count vs. total memory used, without running any UDM applications other than networking: 1M entries, 3 GB; 3M entries, 4 GB; 4M entries, 4.37 GB; and 7M entries, 5.54 GB of RAM. This tells you the potential value of the MAX’s additional RAM in terms of maximum table size.
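As a rough way to read those four data points, here is a minimal Python sketch that fits a straight line through them to estimate the RAM cost per conntrack entry. The linear model, the function names and the assumption that “GB” means 10**9 bytes are mine, not something measured on the router.

```python
# (conntrack entries, RAM in use in GB) as reported in the post above.
SAMPLES = [
    (1_000_000, 3.00),
    (3_000_000, 4.00),
    (4_000_000, 4.37),
    (7_000_000, 5.54),
]


def linear_fit(samples):
    """Least-squares fit of ram_gb = slope * entries + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope_gb_per_entry, base_gb = linear_fit(SAMPLES)

# Assumption: "GB" in the post means 10**9 bytes.
bytes_per_entry = slope_gb_per_entry * 1e9


def entries_that_fit(ram_gb):
    """Number of entries the fitted line predicts for a given amount of RAM."""
    return int((ram_gb - base_gb) / slope_gb_per_entry)


print(f"~{bytes_per_entry:.0f} bytes per entry, ~{base_gb:.2f} GB baseline")
```

On these four points the fit comes out at roughly 420 bytes of RAM per tracked connection on top of a baseline of about 2.7 GB. Treat it as an order-of-magnitude estimate only, since it rests on four samples from one router.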

This was a best-case scenario, of course, after increasing conntrack_max and hashsize beyond the factory defaults, preventing the router from “crashing” (losing all NAT table entries, not actually crashing the router) by tuning the settings including the UDP timeouts, and not overloading the router with too much traffic, roughly staying in the 300 Mbps range.

I have the impression that a router NAT table crash is inevitable no matter what you do, and that the best thing that can be done is maximizing the duration between such crashes. Setting large udp timeouts seems to help with that.

The health of the router can be assessed most directly by monitoring the nf_conntrack_count value through the root ssh shell. A “crash” results in latency spikes beyond 100 ms and a drop in traffic, memory usage and CPU usage, all of which show up in the GUI.
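For anyone who wants to script that check, here is a minimal Python sketch. It assumes the standard Linux netfilter files under /proc/sys/net/netfilter are what the router exposes, and that Python is available wherever you run it (otherwise the same files can simply be read with cat over ssh). The 50% drop threshold for spotting a “crash” is an arbitrary illustration, not a tested value.

```python
from pathlib import Path

# Standard Linux netfilter locations; assumed to be the same on the router.
COUNT_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def read_counter(path):
    """Read a single integer value from a /proc file."""
    return int(Path(path).read_text().strip())


def fill_fraction(count, maximum):
    """Fraction of the conntrack table currently in use."""
    return count / maximum


def looks_like_crash(previous_count, current_count, drop_ratio=0.5):
    """Heuristic (assumption): a sudden drop of more than drop_ratio
    between two samples suggests the table was lost or flushed."""
    return previous_count > 0 and current_count < previous_count * (1 - drop_ratio)


def sample():
    """Take one reading: (entries, table limit, fill fraction)."""
    count = read_counter(COUNT_PATH)
    maximum = read_counter(MAX_PATH)
    return count, maximum, fill_fraction(count, maximum)
```

Calling `sample()` every few seconds and passing consecutive counts to `looks_like_crash()` gives a simple log of how full the table is and when it gets wiped.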

One more thing, from experience: when you try to increase the number of nodes, forget about using your internet connection for TCP, DNS, safeup updates and installing packages. It becomes a UDP-only network for practical purposes, probably due to your WAN IP’s queues in the ISP’s equipment getting clogged with UDP packets :slight_smile: . Or I could just be an unfortunate outlier. Good luck with testing and pushing the limits.


I wouldn’t use that as a general rule. If the ISP has 500 customers and you are responsible for half of the traffic, you will stick out like a sore thumb. If you are one (or one of a few) in a million, they may be too lazy to investigate why you cause so much load on their HW.

It is always a best-case scenario.

It is based on the HW architecture of the “last mile” connection. On GPON (fiber), coax and most wireless systems, download is “cheaper” than upload.

You are unfortunate; this should not happen. My guess is they are running some CG-NAT ugliness with not enough capacity.

Thanks for the feedback. The price ISPs charge for coax-based service is the same as for fiber, which does not make any sense technically, but they are getting away with it because their customers don’t care about upload speed. I don’t think I am behind CG-NAT. But I’m going on vacation now too, to get a lot of rest and be ready to bully my ISPs into submission after I return.


Thanks for the detailed info and settings to start looking into. Equipment should be here by the end of the month, and hopefully I’ll have everything set up before the new service gets here. I don’t know if they’ll let me say “hey, you can put your supplied router in the corner over there as a failsafe, but hook it up to the UDM”, or if they’re going to make me use theirs and I’ll have to switch it over when they leave. Regardless, it’s going through the UDM.

I’m not looking to run thousands of nodes, but a couple hundred from home would be nice. I’ve got the hardware for that without getting anything new.

I’ve always wondered about this, and figured it was a physical limitation of coax (the only type of internet I’ve had since the dial-up days). Is this cheaper in a dollars and cents / peering costs sense? Or cheaper to set up / maintain? Or something totally different?

The difference is point-to-point links versus point-to-multipoint links at the lowest layer. Today’s Ethernet is 1:1 at the lowest level: each cable connects exactly two devices in full duplex, and the wires are not shared with any other device.

A typical coax network segment is a “shared medium” with a tree topology: one master and many clients. There is only one wire to the master, so there is one available frequency range to communicate in, and you have to divide that between download and upload.
Even if you set the ratio to 1:1 you get more download, because for download you just send to the medium and clients pick their packets from the broadcast. On upload you have to prevent collisions between multiple clients trying to upload at the same time. Typically you ask which clients have data to send and then send them a timetable of who sends data when. This overhead cuts significantly into your upload capacity.

Fiber is the same when GPON or another point-to-multipoint technology is used. The limitation comes from the topology, not the type of medium.


The other issue with cable is that uploads use the low-frequency bands of the broadband cable, so one channel for upload has a lot less capacity than a channel used for downloads. This is a design decision dating from the early days of cable, when users only downloaded and upload was only there to facilitate requests for downloads (handshakes included).

If I remember the specs correctly, the uploads are TDM on the channel to prevent errant upstream packets. And now the downloads use multiple (~40 Mbps or ~30 Mbps) channels exclusively for each user. Thus there is a limit on the number of users on each segment from the node. That is why in some places you see multiple cables together, usually closer to the node, as the cables fan out.

It’s possible that in Australia, when the NBN took over the coax (HFC), each user also got exclusive upload channels of about 7 Mbps each. All that is tucked away in the termination box here rather than in an old-style modem/router.


@neo I’ve got my RB5009UG+S+IN on the way. I’m probably going to need some guidance on optimizing my setup. Looking forward to smooth managed internet service.

$205 with tax and shipping. New sealed in original packaging.


I redid mine a few times due to mistakes LOL. It’s not that hard really. For me, I did not have to worry about PPPoE or other such settings. So I may not be that great a help if you need to look after that stuff.

The setup guide from MikroTik had most of it in there. I did a tad extra by moving WAN to port 8 so I had the 2.5Gb port for my LAN. I only have 1Gb on my HFC termination box. Some rules had port 1 instead of WAN, so I went through the default firewall rules to fix that.

The other thing is port forwarding, which is not in the guide, and Google only gave one correct result; the others were out-of-the-butt crap. It’s real easy though. The hard part is doing it for multiple devices and making sure you get the right ports for the right device and rule. Then it’s setting up static DHCP addressing for those devices.

Anyhow ask away when you’re ready.


@Josh Well, it seems my internet is workable with 150 nodes, but anything that requires uploading more than simple requests (like a grocery shopping page full of images) does not always work in a timely fashion. But it isn’t like the dropped-connections style one gets with an ISP router. It’s more like a timeout on the server end because the packets took too long (elapsed time) to get to the server.

Obviously I was overloading my uplink bandwidth-wise: 45 Mbps on a nominal 40 Mbps uplink.
I had previously reduced to 120 nodes, then reduced to 90, and all is good with the grocery shopping. Ironic that my son gamed without issue when I had 150 nodes. Mind you, the game wasn’t a super demanding one like a few are, but it was still an online real-time game. YouTube with its buffering was fine with 150 nodes.
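The numbers above suggest a simple back-of-the-envelope check before adding nodes. This Python sketch assumes upload scales linearly with node count (about 0.3 Mbps per node, from 45 Mbps at 150 nodes) and uses a 25% headroom figure that is purely my own illustrative choice. Real traffic is bursty, so this is only a first approximation.

```python
import math

# Figures from the post above.
OBSERVED_NODES = 150
OBSERVED_UPLOAD_MBPS = 45.0
UPLINK_MBPS = 40.0


def estimated_upload_mbps(nodes):
    """Estimated upload for a node count, assuming linear scaling."""
    return nodes * OBSERVED_UPLOAD_MBPS / OBSERVED_NODES


def max_nodes(uplink_mbps, headroom=0.25):
    """Largest node count that leaves `headroom` of the uplink free.
    The default headroom is an arbitrary illustration."""
    budget_mbps = uplink_mbps * (1 - headroom)
    return math.floor(budget_mbps * OBSERVED_NODES / OBSERVED_UPLOAD_MBPS)


for nodes in (150, 120, 90):
    used = estimated_upload_mbps(nodes)
    print(f"{nodes} nodes: ~{used:.0f} Mbps ({used / UPLINK_MBPS:.0%} of uplink)")
```

Under these assumptions, 120 nodes would still sit at about 90% of the 40 Mbps uplink, while 90 nodes uses roughly two thirds of it, which fits with things only feeling normal again at 90.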

When the team optimise the messaging and node size it’ll become interesting. A lot of bandwidth in that messaging when no chunk traffic is occurring.


I was so close to having a tech out today who would have investigated my issue with the IP.

The appointment was at 8am this morning, but the issue went away at 1am last night.
I know the issue is not here, but the fact that he has direct access to folk I can’t speak to had me hopeful of getting answers.

Anyway all going well again, can’t complain about that. :crossed_fingers:


Well, if it continues to go well, then they found the fault at their end and cancelled the tech visit.

Final update on the UDM PRO MAX:
With about 400 nodes of traffic, with huge values for maximum table and hashtable sizes (32M), and very long udp timeouts to ensure that the NAT table does not need to be cleaned out during this happy time of stability, the 8 GB of RAM can be filled with 13M entries over the course of about 10 hours. The CPU just can’t handle both traffic (around 250 Mbps in this experiment) and cleaning out the table as entries expire.
When the physical RAM limit is hit, the table will crash and the interface will lose the WAN IP address for a few minutes, leading to a complete outage. At least this router can recover from that without a reboot. Preemptively flushing the table with conntrack -F can prevent the loss of the WAN IP address, but still, the latency typically keeps swinging from tens of ms to seconds every 10 seconds or so. The limited power of the CPUs is really the limit here.
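To put the fill rate into perspective, here is a small Python sketch of the arithmetic. It assumes the table grows at a constant rate and that no entries expire during the 10 hours (which is what very long timeouts imply). The steady-state helper uses Little’s law (entries ≈ new flows per second × timeout) as a rough model; it is not a description of how the kernel actually ages entries.

```python
# Figures from the post above.
ENTRIES_AT_LIMIT = 13_000_000
HOURS_TO_FILL = 10
NODES = 400

# Net growth of the table, assuming a constant rate and no expiry.
growth_per_second = ENTRIES_AT_LIMIT / (HOURS_TO_FILL * 3600)
growth_per_node = growth_per_second / NODES


def steady_state_entries(new_flows_per_second, timeout_seconds):
    """Little's law estimate of table size once entries expire as fast
    as they arrive (a simplification)."""
    return new_flows_per_second * timeout_seconds


def longest_timeout_seconds(new_flows_per_second, entry_budget):
    """Longest timeout that keeps the steady-state table within budget."""
    return entry_budget / new_flows_per_second


print(f"~{growth_per_second:.0f} new entries/s in total, "
      f"~{growth_per_node:.2f} per node per second")
```

That works out to about 360 new entries per second, or a little under one per node per second. It shows the trade-off between long timeouts and table size, since under this model the table size scales directly with the timeout.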


Here are the results after a few days… More nodes seem to be actively running, which is cool… It seems stable overall on the network.

I guess I have a few general questions though. I’m sure the info is somewhere, but I’m unsure how to dig it out specifically.

Earnings: What fraction of a coin is each nano? 100th? 10,000th?
Records: What size is a record? Is it a fixed size?
Are PUTs and GETs a fixed size?

…Why the hell do I have so many errors lol.


Not of a coin, but of a token. There is a significant difference.

Nano is an SI prefix.
Milli: 1/1,000 (one thousandth)
Micro: 1/1,000,000 (one millionth)
Nano: 1/1,000,000,000 (one billionth)

So “nano” has just been borrowed from the SI prefixes. It doesn’t need to be specific to any coin or token. So when the Autonomi token gets a name, the nanos will still be the same; I suspect the name has stuck and will continue.
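As a quick illustration of the conversion, here is a minimal Python sketch. The function names are made up for this example, and it uses Decimal to avoid floating-point rounding when handling small token amounts.

```python
from decimal import Decimal

# One nano is one billionth of a token (SI prefix "nano" = 10**-9).
NANOS_PER_TOKEN = 10**9


def nanos_to_tokens(nanos):
    """Convert a whole number of nanos to tokens as an exact Decimal."""
    return Decimal(nanos) / NANOS_PER_TOKEN


def tokens_to_nanos(tokens):
    """Convert a token amount (given as str, int or Decimal) to whole nanos."""
    return int(Decimal(str(tokens)) * NANOS_PER_TOKEN)


print(nanos_to_tokens(250))    # 250 nanos expressed in tokens
print(tokens_to_nanos("0.5"))  # half a token expressed in nanos
```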

Fixed as in it’s a limit figure. Records can be anywhere up to that size. It is 1/2 MB max.

And associated with that, the max number of records in a node is fixed for any implementation. It is 4096.

The team will be, I am sure, changing the figures in a test net or a beta net so that the max size of a node will be bigger than the 2GB max size it is now.
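The 2GB figure follows directly from the two limits above. A minimal Python check, assuming “1/2 MB” means 512 KiB and “2GB” means 2 GiB:

```python
# Limits quoted above; the binary-unit interpretation is an assumption.
MAX_RECORD_BYTES = 512 * 1024  # 1/2 MB per record
MAX_RECORDS_PER_NODE = 4096

max_node_bytes = MAX_RECORD_BYTES * MAX_RECORDS_PER_NODE
max_node_gib = max_node_bytes / 1024**3

print(f"max node size: {max_node_gib} GiB")
```

Since records can be smaller than the limit, a node holding 4096 records will usually store less than this maximum.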

The size is whatever the size of the record being GET or PUT is.

Errors range from a packet transmission error to real problems. UDP is not as reliable a transport layer as TCP, as it doesn’t have the built-in error recovery and retrying that TCP has. TCP masks a lot of errors that we’d see with UDP. So an error in transmission means that the code sees the error and does its own brand of retrying.

Also the home-network flag has even more errors due to the relaying layer to establish connections.

The team are working on reducing the errors and mentioned one fix they are going to try in the latest update. The good thing is that the nodes recover from all these transport layer errors.



If anybody is interested, this is probably the real-life performance limit for node traffic on a MikroTik hAP ax2. I believe my firewall and NAT config is very streamlined; it is just the packet load.
[Two images were attached to the original post here.]
The good thing is that even at this full load, or you could say overload, the router is perfectly stable and I can browse the internet like nothing is happening. Latency is unaffected, still getting around 5 ms to google.com.

EDIT: Right now CPU load is a flat line at 100%, but everything works and the router is pushing 400k pps.


The first image shows that the average packet size being transmitted is around 324 bytes, and this includes the header data, which is a significant part of those 324 bytes. The average packet size for received packets is around 293 bytes.

That means it’s all down to packets per second. It also illustrates that nodes certainly send a lot of small packets compared to the full-sized packets used for sending/receiving chunks.
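To see why small packets shift the bottleneck from bandwidth to packets per second, here is a short Python sketch. The 100 Mbps figure is just an example load, and 1500 bytes stands in for a full-sized Ethernet packet; neither comes from the router screenshots.

```python
def packets_per_second(mbps, avg_packet_bytes):
    """Packets per second needed to carry `mbps` at a given average size."""
    return mbps * 1_000_000 / (avg_packet_bytes * 8)


EXAMPLE_MBPS = 100   # hypothetical load, for illustration only
SMALL_PACKET = 324   # average transmitted packet size from the post
FULL_PACKET = 1500   # typical full-sized Ethernet packet (assumption)

small_pps = packets_per_second(EXAMPLE_MBPS, SMALL_PACKET)
full_pps = packets_per_second(EXAMPLE_MBPS, FULL_PACKET)

print(f"{SMALL_PACKET}-byte packets: {small_pps:,.0f} pps")
print(f"{FULL_PACKET}-byte packets: {full_pps:,.0f} pps")
print(f"ratio: {small_pps / full_pps:.1f}x more packets for the same bandwidth")
```

For the same bandwidth, 324-byte packets mean between four and five times as many packets for the router to process as full-sized ones, and it is the per-packet work that loads the CPU.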
