So if I was to write a guide to setting up Formaiao from @bochaco, I should also include the steps needed for k8s for setups with more than say 20 nodes?
Looks like time for @Southside to refresh his k8s knowledge.
I think anything up to 100 nodes or so should be okayish (maybe even 200 or 300… but when we approach a thousand it will for sure be troublesome)… And that's a lot of nodes for local…
No worries. I've got my infrastructure at home quite streamlined and automated over many years of hard work and input on that front. I am using LXCs (nothing to do with Docker). I don't worry about SSD wear-out, as all of it is on rotational hard drives in a giant CephFS cluster. I don't worry about logs either, as those all go to a separate temporary RAM-backed file system per LXC, shared by all safenodes inside the LXC, with a cron job to trim it. This reduces CPU cycles and disk I/O significantly as well.
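A minimal sketch of what such a trim job could look like, assuming the logs sit on a tmpfs mount such as /mnt/ramlogs (the path, retention window and size cap below are made up for illustration; the post doesn't give the actual values):

```python
#!/usr/bin/env python3
"""Hypothetical cron-driven log trimmer for a RAM-backed log directory."""
import time
from pathlib import Path

LOG_ROOT = Path("/mnt/ramlogs")      # assumed tmpfs mount point
MAX_AGE_HOURS = 6                    # assumed retention window
MAX_SIZE_BYTES = 50 * 1024 * 1024    # assumed per-file cap (50 MiB)

def trim_logs() -> None:
    cutoff = time.time() - MAX_AGE_HOURS * 3600
    for path in LOG_ROOT.rglob("*.log*"):
        try:
            st = path.stat()
            if st.st_mtime < cutoff:
                path.unlink()              # drop stale rotated logs
            elif st.st_size > MAX_SIZE_BYTES:
                with open(path, "r+b") as f:
                    f.truncate(0)          # keep the active file, drop its contents
        except OSError:
            pass                           # a file may vanish mid-scan

if __name__ == "__main__":
    trim_logs()
```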
Everything is automated to spin up multiple LXCs per physical host, with configurable initial settings for the number of safenodes per LXC, the safenode port range, the safenode RPC port range, the safenode metrics port range, etc., and it automatically sets up the NAT port forwarding with NAT reflection on the router.
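A sketch of the kind of port bookkeeping such automation implies: given an LXC index and the number of safenodes per LXC, derive non-overlapping node, RPC and metrics port ranges plus the NAT forwards the router needs. All base ports and names below are illustrative, not the poster's actual values.

```python
from dataclasses import dataclass

NODES_PER_LXC = 20
NODE_PORT_BASE = 12000     # assumed safenode UDP listen ports
RPC_PORT_BASE = 13000      # assumed RPC ports
METRICS_PORT_BASE = 14000  # assumed metrics ports

@dataclass
class NodePorts:
    node: int
    rpc: int
    metrics: int

def ports_for_lxc(lxc_index: int) -> list[NodePorts]:
    """Non-overlapping port block for every safenode inside one LXC."""
    offset = lxc_index * NODES_PER_LXC
    return [
        NodePorts(NODE_PORT_BASE + offset + i,
                  RPC_PORT_BASE + offset + i,
                  METRICS_PORT_BASE + offset + i)
        for i in range(NODES_PER_LXC)
    ]

def nat_forward_rules(lxc_index: int, lxc_ip: str) -> list[str]:
    # One UDP forward per node port; NAT reflection itself is a router-side setting.
    return [f"udp dport {p.node} -> {lxc_ip}:{p.node}"
            for p in ports_for_lxc(lxc_index)]

if __name__ == "__main__":
    for rule in nat_forward_rules(2, "10.0.2.10"):
        print(rule)
```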
The only thing stopping me from truly running everything to the max is the high power requirement of all the hosts today, so that's still an issue I need to sort out with the electrician lol.
At the other locations where I have safenodes running, I pass all the safenodes' network traffic via an IPsec tunnel to my main location and then fan out to the world, and the same in reverse. This avoids the "router NAT session table" overload problems at the other locations, which depend on the modem and router in play at each remote site, since I am not currently bottlenecked on NAT session table entries at my primary location, or so I think (at least I'm not aware of it yet; I haven't pushed it past 1M+ session table entries yet).
Hello,
What router would you recommend upgrading to?
Currently using a 7-year-old Tilgin from my ISP. We have problems and delays loading webpages even without running nodes.
My ISP offers the Dual-Band Wireless AX6000 Gigabit Ethernet Gateway - EX3600-T0 (Zyxel) for just under 100 €. They can give limited support (remote login/reset/check).
Are there better routers out there? The use case is home stuff and running Safe and BTC nodes.
1M+ NAT table size sounds small for a setup limited by circuit breaker capacity. And in my experience the real key metric is the number of connections per IP address. So are you tunneling multiple groups of connections to multiple machines in the cloud to reduce the number of connections per IP?
Second, have you measured UDP packet loss between nodes/UDP ports in your setup and between a home node and one in the cloud? My measurements were quite disappointing and led me to run fewer nodes per IP. (Residential fiber, leasing multiple dynamic IPs.)
I am not sure I'm following you on the above. Any unique combination of source port/IP and destination port/IP is in the router's memory. This is what I am watching, as this is what breaks down on consumer-grade routers (past, say, 8192 or 16384 active states) (most of the generic ones at least).
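As a rough back-of-envelope for when a consumer router falls over: total states ≈ nodes × average concurrent flows per node. The per-node figure below is an assumption for illustration, not a measurement from this thread.

```python
AVG_STATES_PER_NODE = 250            # assumed average concurrent flows per node
CONSUMER_ROUTER_LIMITS = (8192, 16384)

def nodes_before_overflow(limit: int, per_node: int = AVG_STATES_PER_NODE) -> int:
    """How many nodes fit before the NAT state table is full."""
    return limit // per_node

for limit in CONSUMER_ROUTER_LIMITS:
    print(f"{limit} states -> roughly {nodes_before_overflow(limit)} nodes "
          f"before the table fills")
```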
I am not using any cloud provider or anything to do with them. I am tunneling a group of remote machines at sites A/B/C through a dedicated IPsec tunnel between sites A/B/C <= (fiber connection) => D, and then outbound to the internet, while site D's machines also go outbound via site D's router.
I don't have much power capacity here, and my machines are old used hardware (not the latest manufacturing process in terms of nanometers), hence they do draw a decent amount of power.
I see. Thank you for sharing, by the way (and I am hoping for more once the competitions are finally over and focus shifts to increasing the total number of network nodes!).
I'm using 2012 server tech and unfortunately have no reason to oversubscribe anything whatsoever, purely because of my ISP's limitations. I hit a wall there and am still hoping to find a way around it one day, other than going business-grade or enterprise fiber.
So when you mentioned fanout I thought of it as reducing the number of states in the home router by pulling external public IPs into servers at home through a tunnel. (I have tried this using WireGuard and a second IP on a VPS in the cloud but latency was too high in my situation.) This reduces the NAT table at home.
You appear to be doing the opposite, aggregating connections from multiple sites and managing their state in a single NAT table. You may be more fortunate with your ISP, and if you stay away from Linux-based routing you should easily be able to deal with 16-32M NAT entries.
So you're using Alpine as the OS and "OpenTofu" (a fork of Terraform) to create LXC containers?
Couldn't find any info on utilizing Terraform for bare-metal management… it seems oriented to "cloud".
Yes, I have power constraints, and that's likely why I can't quite run as many nodes as my hardware's true capacity allows; otherwise the main router is configured for about 32M max NAT entries at the moment, though I can increase that even more. I suspect the only thing the ISP is doing is packet forwarding on their switches once traffic leaves my router outbound, after the initial hop to their gateway. I am not using any of their equipment (at least not from my home).
So you're using Alpine as the OS and "OpenTofu" (a fork of Terraform) to create LXC containers?
Alpine image for the LXC.
Using Terraform with the appropriate provider against all my physical hosts (they are all hypervisors); none run a bare-metal OS without a hypervisor or virtualization functionality.
Could you advise which providers you use?
Are you able to utilize the Terraform free tier?
I don't recall which Terraform binary version I am using at home at the moment. It's a simple standalone binary to download and use as-is. I never had to pay for anything or use their enterprise-level features.
Terraform has plenty of providers (Xen, ESXi, Azure, AWS, GCP, Proxmox, etc.); the list is endless.
Managed to find an extra source of power at home, so I decided to ramp up the safenodes on one of my servers that was underutilized due to power constraints. I also took the liberty of fixing a glitch in my LACP setup for that server specifically, so it now has full access to 5 Gbps of outbound/inbound bandwidth from the router.
After the above, I started ramping up safenodes, and they generated enough connections that the OS (the hypervisor) was outputting:
nf_conntrack: table full, dropping packet
I think the default might be 65,536, so I upped it to 2^20 = 1,048,576.
It should help keep the state of the safenodes much healthier.
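For anyone hitting the same "table full" message, a small sketch of how you might watch how close you are to the ceiling. The procfs paths are the standard Linux locations for conntrack counters; the 90% warning threshold is an arbitrary choice, and raising the limit itself is a sysctl on net.netfilter.nf_conntrack_max.

```python
from pathlib import Path

COUNT = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX = Path("/proc/sys/net/netfilter/nf_conntrack_max")

def conntrack_usage() -> tuple[int, int]:
    """Return (current entries, configured maximum) from procfs."""
    return int(COUNT.read_text()), int(MAX.read_text())

if __name__ == "__main__":
    count, maximum = conntrack_usage()
    pct = 100 * count / maximum
    print(f"conntrack: {count}/{maximum} entries ({pct:.1f}%)")
    if pct > 90:
        print("warning: close to the limit -- raise net.netfilter.nf_conntrack_max "
              "(e.g. via sysctl) before packets start being dropped")
```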
The router is showing > 1.25 Gbps in steady state now, and a state table size of over 1,000,000 entries.
Still ramping up (before an incoming power meltdown via a circuit breaker tripping).
Have you enquired about getting a temporary service added to your place? You know, the kind they put up for a construction site. Or are you in a multi-dwelling place and just cannot do it? Or maybe set up solar panels and a Powerwall to put some devices on?
Where I am located, I don't have many options here, maybe none. I will be inquiring and contacting an electrician for sure, and more, as and when time frees up.
What kind of router is this that allows the user to increase the size of the connections table?
The feed-in service line can often be the limiting factor for power.
It's software-based. I don't like vendor lock-in, and I don't use enterprise or consumer-grade routers that you buy off the shelf, etc.
It's my custom-built server with my choice of hardware specifications, with pfSense running on top of it plus a bunch of additional FreeBSD-related tweaks. Session table size is only limited by the amount of RAM the server has. I am not sure of an upper limit here yet, but I only gave it 32 GB of RAM for now.
Interesting. So, do you believe pfSense running on capable hardware and any OS, even Win10, might allow one to bypass the limits most routers place on concurrent connections? Do you think if you doubled the RAM you could double the size of the sessions table to somewhere over 2,000,000?
To give you an idea, I am only using 5% of the 32 GB of RAM, and that itself is handling the 1M sessions. The limit I set on it was 3.2M, but that can be raised easily, though I don't have a need to at the moment. So far I have seen the active usage go as high as 1.2M.
There is plenty of buffer room for the future, provided CPU or other constraints don't show up first, though when scaling out there is always a next (unknown) bottleneck.
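To put those numbers in perspective, a rough calculation. It assumes the roughly-1-KB-per-state figure often quoted for pf/pfSense state entries (the actual per-state cost depends on tuning; the thread's observed ~5% of 32 GB for ~1M states works out to about 1.6 KB per state including overhead, which is in the same ballpark).

```python
BYTES_PER_STATE = 1024          # assumed approximation, not a measured value
RAM_BYTES = 32 * 1024**3        # 32 GB in the box

def ram_for_states(states: int, bytes_per_state: int = BYTES_PER_STATE) -> float:
    """Fraction of total RAM consumed by the given number of pf states."""
    return states * bytes_per_state / RAM_BYTES

for states in (1_000_000, 3_200_000, 16_000_000):
    print(f"{states:>10,} states -> ~{100 * ram_for_states(states):.1f}% of 32 GB")
```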
pfSense runs on FreeBSD only as far as I know.
Running nodes? How many?