Best Safe Node hardware

It can also be that nothing is wrong with your setup. The cause can range from the node doing the shunning having an internet hiccup of a few seconds, to your provider having a hiccup, to your own setup having a hiccup. My ISP or the NBN itself sometimes has a few seconds of “paused” activity. Not enough to wreck my son’s games, but he notices it and I rarely do.

1 Like

Yeah, not too worried about the 5% shunned rate at the moment. Will monitor it over time; as long as it stays a relatively low %, it's okay. There is ongoing background scrubbing on my file system (I had disabled it for a few weeks and recently re-enabled it), so I suspect that may be delaying I/O a little and could have contributed to the shunned counts rising. Once that operation is over (caught up), it might all be okay.
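
For anyone curious, a couple of ways to check whether a background scrub is still running and whether it is slowing the nodes' I/O. This is only a rough sketch; the right command depends on the filesystem, and the pool name and mount point below are placeholders.

```
# ZFS: the scan line shows scrub progress, speed and ETA ("tank" is a placeholder pool name)
zpool status tank

# Btrfs: bytes scrubbed so far and error counts ("/srv" is a placeholder mount point)
sudo btrfs scrub status /srv

# Either way, watching await/%util while the scrub runs shows whether node I/O is being delayed
iostat -x 5
```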

Yes, that's a good estimate, provided they were already running the family PC 24/7 as a baseline, consuming X watts in steady state, with some CPU / RAM / storage to spare for antnodes.


I opted to go for higher-density machines rather than very small SBCs (they are multi-purpose as well). Between all the networking requirements (ports/wires) and the different form factors of SBCs, I was put off them, so I only bought old refurbished 2U servers (years ago) so they could all fit in a rack nicely. I prefer to maintain fewer hosts rather than more (less operational burden), as long as there is a minimum number of hosts to fulfil HA requirements etc.

3 Likes

Yea, that was an implied condition. But still, if they run 18 hours a day it is going to take longer to use that kWh, and longer to earn those tokens. In the end it should still be within 20% for a good ballpark example.
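
To put a rough number on that stretch (just a sanity check, not an earnings model):

```
# A node online 18 h/day accrues runtime, and spends each kWh, over ~1.33x as
# many calendar days as one running 24/7.
awk 'BEGIN { printf "duty cycle: %.0f%%, stretch factor: %.2fx\n", 18/24*100, 24/18 }'
```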

I chose to run a couple of SBCs since I do not have the ability to get good upload bandwidth, and I have them for other reasons anyhow. They will sit in my 3D-printed 10" small-depth rack with my router and ISP gear, hidden away in the new bookshelves I am putting in. I am hoping that the resource requirements will allow me to run 20 or even 25 nodes on each of them again. I used to be able to run 30 a good while ago.

Then my PC, which I run while I am awake, can run a set of nodes as well to get up to the 35 Mbps I have for node running.

I suspect that once downloading is happening, the number of nodes per Mbps of upload will drop, maybe to the point where nodes need around 1 Mbps of upload each on average, and the 2 SBCs will be plenty to max out my uplink (40 Mbps nominal).
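
A quick capacity check under that assumed ~1 Mbps-per-node average (the per-node figure is my guess above, not a measured value):

```
# 40 Mbps uplink / 1 Mbps per node = 40 nodes at cap; two SBCs at 20-25 nodes each covers it
awk 'BEGIN { printf "nodes at uplink cap: %d; two SBCs: %d-%d nodes\n", 40 / 1, 2 * 20, 2 * 25 }'
```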

1 Like

@Shu ‘Background file scrubbing’: are you observing block re-org (defrag/GC), or is it at the file system (F/S) level? If the latter, which one are you using? (ie ext3/4 or something else)

What tools are you using to observe these activities?

With older SATA flash storage, it could also be the lower-level integrated block controller, which implements the FTL (Flash Translation Layer), doing wear-levelling moves from time to time.

So occasionally a system will have all of this going on at once, which chews up cycles.

Older file systems like ext3/4 are notorious for doing this ‘periodically’ as a ‘batch operation’. As your drive fills up with files, I/O, especially writes, takes longer and steals more CPU clock from foreground operations. Such a batch-operation design is triggered into action by some ‘primitive’ accumulated state value (ie the number of contiguous blocks of a certain length versus the overall number of blocks possible given the lower-level format technique).

The only remedy is to use a replacement in-memory FTL LKM with low write amplification that works in real time to opportunistically run GC/defrag and wear levelling, bypassing this old method employed by the HBA controllers of the SATA era.

In-memory LKMs must have a UPS that supports at least 5 minutes of uptime during a graceful (write-finality) shutdown, given whatever the maximum write will be; the number of write queues and the depth of the buffers used for block writes are used to calculate the minimum UPS duration needed.
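
A back-of-the-envelope way to size that UPS window from the queue and buffer figures; all four numbers below are made-up examples rather than a vendor formula, so substitute your controller's real values:

```
# Worst case: every write queue full at the maximum block size, flushed at a
# conservative sustained rate before power can be dropped.
QUEUES=8; DEPTH=1024; BLOCK_KB=128; FLUSH_MBPS=100
awk -v q="$QUEUES" -v d="$DEPTH" -v b="$BLOCK_KB" -v f="$FLUSH_MBPS" 'BEGIN {
  mb   = q * d * b / 1024          # worst-case data still in flight, in MB
  secs = mb / f                    # seconds to reach write finality
  printf "in-flight: %.0f MB, flush time: %.1f s, with 5x safety margin: %.1f s\n",
         mb, secs, secs * 5
}'
```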

The same is true of NVMe SSDs today; not every drive's onboard hardware controller logic doing the FTL is the same.

Cheap consumer- and DC-grade drives have slower processors and less memory, and they do get bogged down doing GC/defrag/wear-levelling as the individual drive fills up. So these grades of drive run great when empty, and I/O performance goes downhill from there.

The more expensive enterprise NVMe SSDs have more memory and faster onboard CPUs, which keeps things performant up to a point, but they are also about 50% more expensive than the ‘consumer grade’ or ‘DC grade’ (shorter data retention when sitting unpowered on the shelf) SSDs.

2 Likes

Are you able to run hundreds of nodes on the same machine without memory usage growing exponentially? Or are you running multiple VMs with a smaller number of nodes?

I have both scenarios: LXCs with, say, 50 nodes registered, and similar LXCs with 700 nodes registered, depending on the physical hardware specifications. Memory usage isn't growing exponentially:

While memory usage can grow with handling more circuit relay connections, depending on the distribution of private vs public nodes within the network and/or the total number of open connections, the team is actively investigating this area for a better solution (more details will be shared at some point after internal builds are vetted).

1 Like

Mind if I ask how you’re collecting the data and generating the charts?

Through the use of Telegraf, InfluxDB, and Grafana.

2 Likes

Last night I checked on the nodes after not checking for 2 days, and 10 out of 60 nodes had memory usage ranging from 1.2 GB to 10 GB. Normal is 300-450 MB.

So the memory bloat is still a thing with the node
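
A quick way to spot the bloated ones from a shell (assuming the processes are named antnode, which may differ per setup):

```
# RSS from ps is in KB; prints the header plus any node holding more than ~1 GB resident
ps -C antnode -o pid,rss,etime,cmd --sort=-rss | awk 'NR==1 || $2 > 1048576'
```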

1 Like

Just saw this:

Don’t know if it makes a difference for node running or not.

2 Likes

This has reduced significantly with v0.3.3 and further tweaks:

290 watts with 870 antnodes, against a baseline of 240 watts at 0 antnodes.

Therefore, (290-240)/870*1000 = 57 mW per antnode.

I have plenty of headroom to further increase density of antnodes on this system still.
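
For anyone wanting to plug in their own readings, the same arithmetic as a one-liner:

```
awk 'BEGIN {
  mw  = (290 - 240) / 870 * 1000   # ~57 mW per antnode
  kwh = (290 - 240) * 24 / 1000    # ~1.2 kWh/day for the whole 870-node increment
  printf "%.0f mW per antnode, %.1f kWh/day extra\n", mw, kwh
}'
```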

8 Likes

What operating system are you using? A friend got an HP Z640 with similar specs to my machines, and under Debian it uses significantly less RAM than my Ubuntu computers.



1 Like

The bloat I talked of was the unusual increase in memory usage of a random node. It's a memory leak and unlikely to be limited to one OS.

I am using Linux.

I didn’t look into this issue before, but I did spend some time on it today, and I can confirm the issue persists on v0.3.3 on my home nodes.

The team is aware of the issue (I raised it from my side) and will be looking into the root cause. Once the nodes are running, memory usage stays roughly flat for the container/VM etc., including each individual antnode, but the reported steady-state memory per antnode seems to be linearly correlated with the number of antnodes running inside a single container or VM (in other words, the density of nodes seems to matter here).

Below was the reported Memory in MB for 900 antnodes in a single Alpine LXC vs # Connected Peers:

CPU consumption per antnode doesn't seem to be impacted by the increased memory levels, as far as I can tell.

Note: Our internal upscaling tests focused on increasing the number of droplets, each with the same number of antnodes, when analyzing the impact of the newer builds; we did not pivot to scaling the number of antnodes per droplet itself (a separate type of test), as it was not part of the original upscaling plans. We will likely accommodate this type of upscaling test internally in the future as well, once a proper fix is identified for this issue.

2 Likes

Yes, the increase in memory usage as node count increases is still there.

I can only run approx. 340 nodes on my big PC because it is using 220 GB of the 256 GB RAM and 16 GB of swap.

No VMs or containers; running directly on Linux.

It's like the node is affected by all the connections in use?

I run 265 nodes across several machines, with around 30 GB of 60 GB RAM used.

It's well known that the fewer resources a computer has, the less Rust/the OS allocates to the program. That only makes it worse.

Also, you must include the swap being used. If you use 30 GB out of 60 GB of RAM and also use 60 GB of swap, then that is 90 GB being used.
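
Two quick ways to see how much RAM plus swap is actually in play and whether swapping is actively happening:

```
free -h       # RAM and swap, total vs used, at a glance
vmstat 5      # the si/so columns show pages actively moving to/from swap
```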

1 Like

Why would swap be used when 50% of RAM is unused?

It will depend on your swappiness setting.

I normally use sudo sysctl vm.swappiness=20
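
Worth noting that sysctl set this way only lasts until reboot. One way to make it persistent (the file name under /etc/sysctl.d/ is just a convention):

```
echo 'vm.swappiness=20' | sudo tee /etc/sysctl.d/99-swappiness.conf
sudo sysctl --system   # reloads all sysctl configuration files
```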

2 Likes

Swappiness setting, what da f’ck? :joy: I use the default: use RAM until there is none left, then pray to gawd and use swap until Armageddon. :joy:

1 Like