Background:
Wrote a script to start nodes in a local network, since node-manager randomly crashes with an "RPC cannot be contacted" error. Probably a race condition.
Or node-manager just takes far too long (hours and hours) doing a registry refresh when trying to run 500 to 1000 local nodes.
Using my script (or node-manager), the memory usage per node increases as the number of nodes increases.
As in:
up to 100 nodes, each is taking 80-120MB
as more nodes are added, each new node uses more memory
over about 450 nodes, the usage seems to be 800MB +/- 50MB per node (system monitor), while each node reports itself as using 600+MB
No RPC port used
No Metrics port used
Listening port set when starting each node (incrementing port number starting at 50000)
There was no noticeable difference from the rate of starting the nodes; the memory usage shows the same pattern/amounts whether nodes are started quickly or with long periods between each start.
All nodes were started normally as local nodes, with up to 20 peers specified (20 once more than 20 nodes exist). A rough sketch of the launcher is below.
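For reference, something along these lines (a minimal Rust sketch, not the actual script; the safenode binary name, the --peer flag and the address format are placeholders rather than the real CLI, while --node-port and the incrementing 50000+ ports match the setup above):

```rust
// Sketch of a local-network launcher: spawn N nodes with incrementing
// listening ports, giving each one up to 20 already-started nodes as peers.
// "safenode", "--peer" and the address format are placeholders for this
// sketch; --node-port and the 50000+ port range match the setup above.
use std::process::Command;
use std::{thread, time::Duration};

fn main() -> std::io::Result<()> {
    let node_count: u16 = 500;
    let base_port: u16 = 50_000;
    let mut peers: Vec<String> = Vec::new();

    for i in 0..node_count {
        let port = base_port + i;
        let mut cmd = Command::new("safenode"); // placeholder binary name
        cmd.arg("--node-port").arg(port.to_string());
        // No RPC port and no metrics port, matching the setup above.

        // Hand the node up to 20 of the most recently started nodes as peers.
        for addr in peers.iter().rev().take(20) {
            cmd.arg("--peer").arg(addr);
        }

        cmd.spawn()?;
        peers.push(format!("/ip4/127.0.0.1/udp/{port}")); // placeholder format

        // Delay between starts; the memory pattern looked the same whether
        // this gap was short or long.
        thread::sleep(Duration::from_millis(500));
    }
    Ok(())
}
```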
So looking at total memory usage (using system monitor), I see 800MB per node once over 400 nodes, while the node reports it is using less: over 600MB but under 700MB.
The earlier-started nodes do not seem to increase in memory as more nodes are added.
One possible reason I can see is that nodes added later use more memory to start up while finding nodes to connect to, but never release that extra memory. That would also explain why the earlier nodes are using less memory, since they never needed it to start.
This will limit my testing using local networks; I only have 256GB of main memory. The 48-thread CPU has no problem handling the 500 nodes, at less than 20% CPU usage.
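For rough scale: at ~800MB per node, 256GB covers only about 256,000 / 800 ≈ 320 nodes, whereas at the ~100MB the early nodes use, the same RAM would cover roughly 2,500.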
I expected, though, that all the nodes would need the same nominal amount of RAM whether started early on or after 400 nodes are already running.
@joshuef @chriso who is the best person to ask about why this occurs and whether there is a way to avoid the increasing memory usage of newly added nodes?
I see node RAM usage for nodes running in --home-network mode go up and up, but ones running with ports set via --node-port reach a level and don't really increase much, e.g. for these nodes that have been running for 18 days. (VDash had crashed for the one in the 2nd screenshot.)
Not a priority, but maybe I've done something wrong and can be told how to fix (or work around) it.
If it is a feature, then someone in the future will remove said feature and it will not have been tested.
But yes, not a priority; one for down the track. The only priority is if I am doing something wrong, and to find out what.
Yeah, this does seem to be something specific to local networks (i.e. a test network on your device). I say that because there are people running hundreds of nodes and I am sure they don't have 256GB of RAM.
This is not a situation of RAM usage going up over time; it's a situation where the 300th node uses around 600MB and the 400th node uses 800MB (showing 600+ in its logs) from the start, and it remains at that level, only varying up or down by a few MB according to the logs.
On my system VDash was working with the 500 nodes, but during uploading it was showing inactive nodes, which means either logs were not being written or VDash wasn't reading them fast enough for 2 minutes. CPU never went above 40%.
Thank you for the reply.
It does just seem to be a case of not returning memory to the system after the node finishes starting up, maybe after scanning for nodes to take on as peers.
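To illustrate the idea (a standalone Rust sketch, not the node's actual code): make a burst of allocations like a startup/peer-scan phase might, free it all, and watch the resident set size that a system monitor would report. On Linux with the default allocator, the freed heap pages are often kept by the process rather than returned to the OS, which would also fit the gap between what the system monitor shows and what the node reports for itself.

```rust
// Allocate ~800MB in small chunks, drop it, and print RSS (Linux-only,
// read from /proc/self/status) before and after. RSS often stays elevated
// after the drop because the allocator keeps the freed heap pages.
use std::fs;

fn rss_kb() -> Option<u64> {
    // Parse the "VmRSS:   123456 kB" line from /proc/self/status.
    fs::read_to_string("/proc/self/status").ok()?
        .lines()
        .find(|line| line.starts_with("VmRSS:"))?
        .split_whitespace()
        .nth(1)?
        .parse()
        .ok()
}

fn main() {
    println!("baseline RSS: {:?} kB", rss_kb());

    // Simulate a startup burst: many small, short-lived allocations.
    let mut burst: Vec<Vec<u8>> = Vec::new();
    for _ in 0..200_000 {
        burst.push(vec![0u8; 4096]); // ~800MB in total
    }
    println!("after burst RSS: {:?} kB", rss_kb());

    drop(burst); // freed from the program's point of view...
    println!("after drop RSS:  {:?} kB", rss_kb());
    // ...but the system monitor may still count most of it against the process.
}
```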
I have seen this on the main network as well. This, and the other bug where older nodes seem to earn faster and faster, are good reasons for starting nodes early and for using a UPS. (Of course, I have spent quite some time looking into Rust memory allocation, and how it behaves on different architectures and OSes.)