So if I was to write a guide to setting up Formaiao from @bochaco, I should also include the steps needed for k8s for setups with more than say 20 nodes?
Looks like time for @Southside to refresh his k8s knowledge.
I think anything up to 100 nodes or so should be okayish (maybe even 200 or 300… but when we approach a thousand it will for sure be troublesome)… And that's a lot of nodes for local…
No worries. I've got my infrastructure at home quite streamlined and automated over many years of hard work and input on that front. I am using LXCs (nothing to do with Docker). I don't worry about SSD wear-out, as all of it is on rotational hard drives in a giant CephFS cluster. I don't worry about logs either, as those all go to a separate temporary RAM-backed file system per LXC, shared by all safenodes inside the LXC, with a cron job to trim it. This reduces CPU cycles and disk I/O significantly as well.
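A minimal sketch of what such a trim job could look like, assuming the logs sit on a tmpfs mount such as /mnt/ramlogs (the path, retention window and size cap below are made up for illustration; the post doesn't give the actual values):

```python
#!/usr/bin/env python3
"""Hypothetical cron-driven log trimmer for a RAM-backed log directory."""
import time
from pathlib import Path

LOG_ROOT = Path("/mnt/ramlogs")      # assumed tmpfs mount point
MAX_AGE_HOURS = 6                    # assumed retention window
MAX_SIZE_BYTES = 50 * 1024 * 1024    # assumed per-file cap (50 MiB)

def trim_logs() -> None:
    cutoff = time.time() - MAX_AGE_HOURS * 3600
    for path in LOG_ROOT.rglob("*.log*"):
        try:
            st = path.stat()
            if st.st_mtime < cutoff:
                path.unlink()              # drop stale rotated logs
            elif st.st_size > MAX_SIZE_BYTES:
                with open(path, "r+b") as f:
                    f.truncate(0)          # keep the active file, drop its contents
        except OSError:
            pass                           # a file may vanish mid-scan

if __name__ == "__main__":
    trim_logs()
```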
Everything is automated to spin up multiple LXCs per physical host, with configurable initial settings for the number of safenodes per LXC, the safenode port range, the safenode RPC port range, the safenode metrics port range, etc., and it automatically sets up the NAT port forwarding with NAT reflection on the router.
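A sketch of the kind of port bookkeeping such automation implies: given an LXC index and the number of safenodes per LXC, derive non-overlapping node, RPC and metrics port ranges plus the NAT forwards the router needs. All base ports and names below are illustrative, not the poster's actual values.

```python
from dataclasses import dataclass

NODES_PER_LXC = 20
NODE_PORT_BASE = 12000     # assumed safenode UDP listen ports
RPC_PORT_BASE = 13000      # assumed RPC ports
METRICS_PORT_BASE = 14000  # assumed metrics ports

@dataclass
class NodePorts:
    node: int
    rpc: int
    metrics: int

def ports_for_lxc(lxc_index: int) -> list[NodePorts]:
    """Non-overlapping port block for every safenode inside one LXC."""
    offset = lxc_index * NODES_PER_LXC
    return [
        NodePorts(NODE_PORT_BASE + offset + i,
                  RPC_PORT_BASE + offset + i,
                  METRICS_PORT_BASE + offset + i)
        for i in range(NODES_PER_LXC)
    ]

def nat_forward_rules(lxc_index: int, lxc_ip: str) -> list[str]:
    # One UDP forward per node port; NAT reflection itself is a router-side setting.
    return [f"udp dport {p.node} -> {lxc_ip}:{p.node}"
            for p in ports_for_lxc(lxc_index)]

if __name__ == "__main__":
    for rule in nat_forward_rules(2, "10.0.2.10"):
        print(rule)
```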
The only thing stopping me from truly running everything to the max is the high power requirement of all the hosts today, so that's still an issue I need to sort out with the electrician lol.
At the other locations where I have safenodes running, I pass all the safenodes' network traffic via an IPsec tunnel to my main location and then fan out to the world, and the same in reverse. This avoids the "router NAT session table" overload problems at the other locations, which depend on the modem and router in play at each remote site, since I am not currently bottlenecked on NAT session table entries at my primary location, or so I think (at least I'm not aware of it yet; I haven't pushed it past 1M+ session table entries yet).
Hello,
What router would you recommend upgrading to?
Currently using a 7-year-old Tilgin from my ISP. We have problems and delays loading webpages even without running nodes.
My ISP offers the Dual-Band Wireless AX6000 Gigabit Ethernet Gateway - EX3600-T0 (Zyxel) for just under 100 €. They can give limited support (remote login/reset/check).
Are there better routers out there? The use case is home stuff and running Safe and BTC nodes.
1M+ NAT table size sounds small for a setup limited by circuit breaker capacity. And in my experience the real key metric is the number of connections per IP address. So are you tunneling multiple groups of connections to multiple machines in the cloud to reduce the number of connections per IP?
Second, have you measured UDP packet loss between nodes/UDP ports in your setup and between a home node and one in the cloud? My measurements were quite disappointing and led me to run fewer nodes per IP. (Residential fiber, leasing multiple dynamic IPs.)
I am not sure I'm following you on the above. Any unique combination of source port/IP and destination port/IP is in the router's memory. This is what I am watching, as this is what breaks down on consumer-grade routers (past, say, 8192 or 16384 active states) (most of the generic ones at least).
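As a rough back-of-envelope for when a consumer router falls over: total states ≈ nodes × average concurrent flows per node. The per-node figure below is an assumption for illustration, not a measurement from this thread.

```python
AVG_STATES_PER_NODE = 250            # assumed average concurrent flows per node
CONSUMER_ROUTER_LIMITS = (8192, 16384)

def nodes_before_overflow(limit: int, per_node: int = AVG_STATES_PER_NODE) -> int:
    """How many nodes fit before the NAT state table is full."""
    return limit // per_node

for limit in CONSUMER_ROUTER_LIMITS:
    print(f"{limit} states -> roughly {nodes_before_overflow(limit)} nodes "
          f"before the table fills")
```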
I am not using any cloud provider or anything to do with them. I am tunneling a group of remote machines at sites A/B/C through a dedicated IPsec tunnel between sites A/B/C <= (fiber connection) => D, and then outbound to the internet, while site D's machines also go outbound via site D's router.
I don't have much power capacity here, and my machines are old used hardware (not the latest manufacturing process in terms of nanometers), hence they do draw a decent amount of power.
I see. Thank you for sharing, by the way (and I am hoping for more once the competitions are finally over and focus shifts to increasing the total number of network nodes!).
I'm using 2012 server tech and unfortunately have no reason to oversubscribe anything whatsoever, purely because of my ISP's limitations. I hit a wall there and am still hoping to find a way around it one day, other than going business-grade or enterprise fiber.
So when you mentioned fanout I thought of it as reducing the number of states in the home router by pulling external public IPs into servers at home through a tunnel. (I have tried this using WireGuard and a second IP on a VPS in the cloud but latency was too high in my situation.) This reduces the NAT table at home.
You appear to be doing the opposite, aggregating connections from multiple sites and managing their state in a single NAT table. You may be more fortunate with your ISP, and if you stay away from Linux-based routing you should easily be able to deal with 16-32M NAT entries.
So you're using Alpine as the OS and "OpenTofu" (a fork of Terraform) to create LXC containers?
Couldn't find any info on utilizing Terraform for bare-metal management… it seems oriented to "cloud".
Yes, I have power constraints, and that's likely why I can't quite run as many nodes as my hardware's true capacity allows; otherwise the main router is configured for about 32M max NAT entries at the moment, though I can increase that even more. I suspect the only thing the ISP is doing is packet forwarding on their switches once traffic leaves my router outbound, after the initial hop to their gateway. I am not using any of their equipment (at least not from my home).
So you're using Alpine as the OS and "OpenTofu" (a fork of Terraform) to create LXC containers?
Alpine image for the LXC.
Using Terraform with the appropriate provider against all my physical hosts (they are all hypervisors); none run a bare-metal OS without a hypervisor or virtualization functionality.
Could you advise which providers you use?
Are you able to utilize the Terraform free tier?
I don't recall which Terraform binary version I am using at home at the moment. It's a simple standalone binary to download and use as-is. I never had to pay for anything or use their enterprise-level features.
Terraform has plenty of providers (Xen, ESXi, Azure, AWS, GCP, Proxmox, etc.); the list is endless.
Managed to find an extra source of power at home, so I decided to ramp up the safenodes on one of my servers that was underutilized due to power constraints. I also took the liberty of fixing a glitch in my LACP setup for that server specifically, so it now has full access to 5 Gbps of outbound/inbound bandwidth from the router.
After the above, I started ramping up safenodes, and they generated enough connections that the OS (the hypervisor) was outputting:
nf_conntrack: table full, dropping packet
I think the default might be 65,536, so I upped it to 2^20 = 1,048,576.
It should help keep the state of the safenodes much healthier.
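For anyone hitting the same "table full" message, a small sketch of how you might watch how close you are to the ceiling. The procfs paths are the standard Linux locations for conntrack counters; the 90% warning threshold is an arbitrary choice, and raising the limit itself is a sysctl on net.netfilter.nf_conntrack_max.

```python
from pathlib import Path

COUNT = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX = Path("/proc/sys/net/netfilter/nf_conntrack_max")

def conntrack_usage() -> tuple[int, int]:
    """Return (current entries, configured maximum) from procfs."""
    return int(COUNT.read_text()), int(MAX.read_text())

if __name__ == "__main__":
    count, maximum = conntrack_usage()
    pct = 100 * count / maximum
    print(f"conntrack: {count}/{maximum} entries ({pct:.1f}%)")
    if pct > 90:
        print("warning: close to the limit -- raise net.netfilter.nf_conntrack_max "
              "(e.g. via sysctl) before packets start being dropped")
```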
The router is showing > 1.25 Gbps in steady state now, and a state table size of over 1,000,000 entries.
Still ramping up (before an incoming power meltdown via a circuit breaker tripping).
Have you enquired about getting a temporary service added to your place? You know, the kind they put up for a construction site. Or are you in a multi-dwelling place and just cannot do it? Or maybe set up solar panels and a Powerwall to put some devices on?
Where I am located, I don't have many options here, maybe none. I will be inquiring and contacting an electrician for sure, and more, as and when time frees up.
What kind of router is this that allows the user to increase the size of the connections table?
The feed-in service line can often be the limiting factor for power.
It's software-based. I don't like vendor lock-in, and I don't use enterprise or consumer-grade routers that you buy off the shelf, etc.
It's my custom-built server with my choice of hardware specifications, with pfSense running on top of it plus a bunch of additional FreeBSD-related tweaks. Session table size is only limited by the amount of RAM the server has. I am not sure of an upper limit here yet, but I only gave it 32 GB of RAM for now.
Interesting. So, do you believe pfSense running on capable hardware and any OS, even Win10, might allow one to bypass the limits most routers place on concurrent connections? Do you think if you doubled the RAM you could double the size of the sessions table to somewhere over 2,000,000?
To give you an idea, I am only using 5% of the 32 GB of RAM, and that itself is handling the 1M sessions. The limit I set on it was 3.2M, but that can be raised easily, though I don't have a need to at the moment. So far I have seen the active usage go as high as 1.2M.
There is plenty of buffer room for the future, provided CPU or other constraints don't show up first, though when scaling out there is always a next (unknown) bottleneck.
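To put those numbers in perspective, a rough calculation. It assumes the roughly-1-KB-per-state figure often quoted for pf/pfSense state entries (the actual per-state cost depends on tuning; the thread's observed ~5% of 32 GB for ~1M states works out to about 1.6 KB per state including overhead, which is in the same ballpark).

```python
BYTES_PER_STATE = 1024          # assumed approximation, not a measured value
RAM_BYTES = 32 * 1024**3        # 32 GB in the box

def ram_for_states(states: int, bytes_per_state: int = BYTES_PER_STATE) -> float:
    """Fraction of total RAM consumed by the given number of pf states."""
    return states * bytes_per_state / RAM_BYTES

for states in (1_000_000, 3_200_000, 16_000_000):
    print(f"{states:>10,} states -> ~{100 * ram_for_states(states):.1f}% of 32 GB")
```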
pfSense runs on FreeBSD only as far as I know.
Running nodes? How many?