A dev friend of mine has cloud setups on significantly weaker machines, with so many nodes, that earn significantly more. That is what I based my assumption on: that it is related to internet quality and loss of connectivity over time, because during the last months of the beta the same setup was earning well under a significantly higher load. But anyway, I will test with 1131 nodes and tell you the difference.
I would like to weigh in here with my experience. Personally, I do not think the lack of earnings has anything to do with connection.
As most people know, I run both from home and from the cloud. When it comes to data chunk storage, before TGE my home nodes performed close to equal to (though slightly less than) my cloud nodes. After TGE, home nodes have dropped off horribly, earning around 10 to 15% of what my cloud nodes are (or have been; I'll get to that) earning. My home nodes are behind a very strong and capable router, are connected with fiber, and have a very stable, high-bandwidth connection (not close to the limit). Both setups load the systems equally and run around the same number of nodes per device.
My earlier assumption was that this had something to do with latency, where nodes try to find nodes closest to them and most of the nodes being run at central places like data centers (Hetzner for example). I kinda expected that because of this, home nodes far away from those data centers are eventually going to lack connections as they do not have as many nodes nearby as the cloud servers.
What's weird is that recently I slowly started killing my server nodes to restart them (using a different reward address), and earnings have since dropped by around 50%. So the newer nodes are apparently not earning. Same servers, same settings, just started a week or two later.
I'm not sure if this is at all helpful, but I do think it paints a pretty clear picture that server load is not really a thing in general (perhaps for some individuals it still is).
I have a feeling that the problem might be with the ISP. Server halls like Hetzner have no consumer ISP in the way, while from home the data still needs to go through the ISP, and that might have some impact and impose limitations.
Maybe @Josh has some thoughts on it.
I am currently getting 11-15 ANT rewards from 2.4k nodes. 5 physical machines + 2 VPS, all running since start with no restarts.
This could be about the quality of BGP connectivity and peerings. Hetzner has motivation to have good and reliable peers everywhere. Your average home ISP needs a line to Google, Meta and a few others, and for the rest the cheapest option will do. Not many customers will complain that communication to random addresses on the other side of the world is slow, has packet loss, and sometimes doesn't work at all.
Yes, this is very important and hard to pinpoint. Also HW plays a big role - CPU caches, RAM latency, network card and its driver.
Another thing is OS tuning. Defaults are for general use on average hardware, and they may be highly ineffective for running thousands of nodes on a multi-CPU server. Sometimes changing a few kernel parameters can make a huge difference.
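To make that concrete, here is the kind of sysctl tuning I mean. These values are generic starting points for boxes running thousands of network-heavy processes, not antnode-specific recommendations; benchmark before and after:

```shell
# Illustrative kernel tweaks for many concurrent network-heavy processes.
# Values are generic assumptions, not tuned recommendations.
sysctl -w fs.file-max=2097152                        # raise total open file descriptors
sysctl -w net.ipv4.ip_local_port_range="1024 65535"  # widen ephemeral port range
sysctl -w net.core.somaxconn=8192                    # larger listen backlog
sysctl -w net.core.rmem_max=16777216                 # allow bigger socket receive buffers
sysctl -w net.core.wmem_max=16777216                 # allow bigger socket send buffers
# To persist across reboots, put the same keys in /etc/sysctl.d/99-nodes.conf
```

You need root to apply these, and per-process file descriptor limits (ulimit -n / systemd LimitNOFILE) matter just as much as the global fs.file-max.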
Btw, has anybody tried running nodes with fixed CPU affinity? In theory it should reduce the CPU cycles spent on context switching, especially on CPUs with multiple core groups or on multi-CPU systems.
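For anyone who wants to try it, a minimal sketch with taskset from util-linux. The node start command is a placeholder here (a sleep stands in for it):

```shell
# Pin a process to core 0 with taskset (util-linux).
# Replace 'sleep 30' with your actual node start command.
taskset -c 0 sleep 30 &
pid=$!
taskset -cp "$pid"   # print the affinity list of the running process
kill "$pid"
```

If you start nodes via systemd, the same effect per service is available with the CPUAffinity= directive in the unit file, which avoids wrapping every launch command.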
I don't believe the length of the cable was the issue in terms of latency. Probably the cable was bad: not bad enough to lose the link, but enough for packet loss, malformed packets that need to be resent, etc.
Just to note that the same setup was working fine under a much heavier load for the last 2 months. It also worked fine for the first few days, then the token yield gradually reduced to zero; it didn't stop abruptly.
Have the nodes that earn less actually been shunned? Can you see that from the logs?
Also, when talking about "earning 50% less" (for example), I think the only way to make that statement comparable to any other number is:
50% less over the exact same UTC hours as the other nodes it is compared to.
Also, I think we would need to see some good statistical analysis about what kind of variance is expected within certain sample sizes.
But the most important question: have the nodes been shunned or not?
Just to add some info: I have a Hetzner server running 1k nodes that earns about 10 ANT a day, and home nodes (multiple machines running 50-200 nodes each) earn between 1-2 ANT per day each, so it's pretty equal for me.
Until my mining stopped, I kept the same number of nodes: 3000 per machine. I then scaled up to 6000. Now on 1 machine I've throttled the initial launch to 1100 nodes and will monitor how it plays out.
I think they are not completely shunned. If I have understood it correctly, it's not that a node either is or isn't shunned outright; it is shunned by some other node or nodes, not necessarily by all other nodes.
Since the nodes are spread all over, there are pretty much always some shunned node pairs.
But this is speculation; maybe we should try to prove it true or false? Either by searching the logs or by reasoning from the actual behaviour of the nodes?
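Searching the logs could be as simple as counting shunning-related lines per node. The glob below is an assumption about where the logs live; point it at your actual node log directory:

```shell
# Count shunning-related lines in each node's log.
# The path pattern is an assumption -- adjust to your log directory.
for f in ~/.local/share/autonomi/node/*/logs/antnode.log; do
  printf '%s: %s\n' "$f" "$(grep -ci 'shun' "$f")"
done
```

Comparing those counts between low-earning and normal nodes over the same time window would go a long way toward proving or disproving the shunning theory.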
Not really, other than my finding that a group of machines on a very long Ethernet cable was earning significantly less.
I observed higher ping times and attributed it to that. Rob suggested packet loss, but I couldn't be bothered hunting the issue down and just moved them closer, which solved the issue.
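For anyone who wants to rule packet loss in or out without moving hardware, a quick check is to let ping print its loss summary; the target below is a documentation placeholder, so substitute a real peer or your gateway:

```shell
# Send 50 probes and read the "% packet loss" figure in the summary.
# 203.0.113.1 is a placeholder (documentation range) -- replace it.
ping -q -c 50 203.0.113.1
```

Anything consistently above 0% loss toward your gateway on a wired link points at cabling or NIC trouble rather than the wider network.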