Can't get a node to run on a 4GB Raspberry Pi 4. Works fine on an 8GB one

I have just made a slightly distressing discovery - I can’t get a node running on a 4GB Raspberry Pi 4! I’ve tried on two of them I have deployed in friends’ houses, with the same result.

I get this error:-

    ant node start
    ✗ Failed to start 1 node(s):
      ● node1 (1) — Process spawn failed: Node 1 exited immediately:
    Error:
       0: node startup failed: Failed to create LMDB storage: storage error: Failed to open LMDB env: Cannot allocate memory (os error 12)

    Location:
       src/bin/ant-node/main.rs:155

    Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
    Run with RUST_BACKTRACE=full to include source snippets.

After some googling and some Clauding, and trying changing ulimit settings and allowing overcommit of memory, I’ve just about given up. I think there just isn’t enough RAM for a node to run on a Pi4 with only 4GB, with the way the node works. I’ll try on an Ubuntu VM later to see if it’s Pi or Pi OS specific. Yes, I’m using the Pi version of Ubuntu.
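
For the record, this is roughly what the overcommit/ulimit tinkering looked like (these are just the standard kernel knobs, nothing ant-specific), and neither made any difference:

    # let the kernel overcommit virtual memory (resets on reboot)
    sudo sysctl vm.overcommit_memory=1
    # lift the per-process virtual memory cap for the current shell
    ulimit -v unlimited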

I’ve seen using top that a node claims 110.9 GB of virtual RAM! Even though a running node really only uses less than 100MB of resident memory.
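
If you want to see the same thing, comparing virtual (VSZ) against resident (RSS) memory with ps shows it clearly enough:

    # biggest virtual-memory claimants first; VSZ and RSS are in KiB
    ps -eo pid,vsz,rss,comm --sort=-vsz | head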

Does anyone have any ideas? Is this a known issue?

(sorry for the cross-post with the forum but I think this is the best place because of the ephemeral nature of Discord!)

I am about to try on a 4GB Pi4.

In other news, the desktop app is now released for the CLI-averse.

Where did you download the ant binary for your Pi4?

I tried from downloads.autonomi.com/node/aarch64 but I get an ‘Exec format error’ when running the binary.

I just used the curl command to download ‘ant’ and then did:-

ant node add

ant node daemon start

ant node start

Works fine on both the 8GB Pi4s I have in my small fleet. Total no-go on both the 4GB ones. Bit of a pisser if they never work, because I have another few to go out to people!

Seems the problem is in the database initialisation. Guess the LMDB library routines want the memory for some reason to set up the database files.

Is that something that can be easily fixed?

There are many 4GB SBCs kicking about; 8GB ones are a very different story.

Not sure, and it would likely have to be done at build time: a configuration change that gets passed to the library. The pitfalls of developing on higher-end machines and testing on the same.

Looking forward to node v0.10.1 ASAP then.

Thank you for looking at it. Do you think it’s worth me trying to edit and compile? I’m wondering if the database has to be that size, in which case it will just not work. I’m going to try later if I get time.

I’m really hoping it was a mistake or oversight that gets quickly fixed, or that they can make it work with a smaller DB if it currently actually needs a big one. As @Southside pointed out on Discord, there are a lot of 4GB SBCs out there.

I activated zram on my 4GB RPi4 :smiley: and the node started and ran nicely for quite a while… (but full disclosure: it just seems to have stopped somewhere during the night… no idea why that would have happened…)
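
For anyone wanting to try the same, one straightforward way to enable it (assuming the zram-tools package on Ubuntu; I may have done it slightly differently) is:

    # install a compressed swap device that lives in RAM
    sudo apt install zram-tools
    # optional: tune how much RAM it may use, e.g. PERCENT=50 in /etc/default/zramswap
    sudo systemctl enable --now zramswap.service
    # confirm the zram swap device is active
    swapon --show
    zramctl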

Doesn’t look super tight on resources to run a node on a Pi4.

Ok! That is very interesting! Thank you. I’ll try that first.

Are you trying with v0.10.1?

I ran ant update to v0.1.6 and downloads became a LOT faster - still not the blistering speeds we have seen previously but still…

I’ve just updated, reset nodes, added one - which is now 0.10.1 - and tried starting it again with the same result.

I will try the zram suggestion now.

My guess is this needs to be reported so it can be handled at the official build level.

Did it actually stop?? Look at the processes running. I realised this morning that what the daemon thinks is running and what is actually running are not always the same.
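
A quick way to compare the two (assuming the node process name contains “ant”, e.g. ant-node):

    # everything actually running whose command line mentions "ant"
    pgrep -af ant
    # with memory figures (VSZ/RSS in KiB) for a closer look
    ps -eo pid,etime,vsz,rss,comm | grep -i ant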

I’ve cracked the problem! I have a suggestion for the devs on how to stop it happening, but I also have a workaround that works.

I have to do things in the garden just now but I’ll write it up later.

Also, maybe zswap settings that compress memory could help, or is the gap to what is needed too large?

“Yes, Raspberry Pi can use zswap memory on Ubuntu, particularly starting with Ubuntu 22.04 LTS, where it is enabled by default for better performance on devices with limited RAM.”

Practical overall system memory improvement: often 10–40% of RAM depending on how much swapping occurs and how compressible the swapped pages are.

Lightly compressible workloads (already-compressed data): <10% benefit.
Moderately compressible (typical app memory): ~10–30%.
Highly compressible (zeroed or repetitive pages): can exceed 50% for the swapped subset.
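
If you want to check whether zswap is actually active on your image before relying on those numbers, the standard sysfs parameters tell you (note that zswap still needs a normal swap device behind it, unlike zram):

    # "Y" means zswap is enabled
    cat /sys/module/zswap/parameters/enabled
    # show all current zswap settings (compressor, pool size, etc.)
    grep -r . /sys/module/zswap/parameters/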

I’ve cracked the problem with starting a node on a 4GB Raspberry Pi 4. It was due to the combination of the RPi4 having 4GB RAM and a 512GB SSD!

8GB RPi4 with 128GB SATA drive = worked fine.
8GB RPi4 using just the 128GB MMC card = worked fine.
4GB RPi4 with 512GB drive = Error described above.
4GB RPi4 with 512GB drive = Error described above.

(I’d bought the two with 8GB when RAM seemed to be a big constraint a couple of years ago. When that became less of an issue I bought a bunch of 4GB ones. I put 512GB SSDs in them last year, when it seemed like about 10 nodes on one RPi4 was OK in terms of CPU, RAM, network throughput and network sessions.)

I went down rabbit holes with zram and with overcommit settings which did not help.

After some googling and Clauding, the solution is to create a dummy file that consumes most of the space on the SSD. It can be deleted once you’ve added as many nodes as you are going to run.

The big hint that there was something going on with RAM allocation and disk size was that on the working 8GB RPi4 an ant-node gets allocated 110GB of virtual RAM, which was suspiciously close to the size of the disk.

This is the summary from Claude:-


The bug: ant-node calculates its LMDB map size based on available disk space (disk space minus 500MB reserve). On a system with a large disk (445GB free), it tries to mmap ~445GB of virtual address space. On a 4GB RAM system, the Linux kernel refuses this mmap call with ENOMEM (os error 12), even though it’s only virtual address space reservation and no physical RAM is actually consumed.

Why the 8GB system works: Its disk is only 128GB, so the computed mmap size (~110GB) is small enough that the kernel allows it on an 8GB RAM system.

The workaround: Creating a large dummy file with fallocate reduces apparent free disk space, which reduces the computed mmap size to something the kernel will accept.

The proper fix would be for ant-node to cap max_map_size based on available virtual address space (or a reasonable default like 100GB) rather than raw disk space, or to expose --max-map-size as a command-line argument so operators can set it manually.
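
As a rough back-of-the-envelope check on that explanation (the ~500MB reserve figure comes from the summary above; the exact calculation inside ant-node may differ), you can reproduce the suspicious number straight from df on the node’s data volume:

    # approximate the auto-computed LMDB map size: free disk space minus the reserve
    data_dir=/home/safe/.local/share/ant   # adjust to your node data directory
    avail_kib=$(df --output=avail "$data_dir" | tail -n 1)
    reserve_kib=$((500 * 1024))             # ~500MB reserve, per the summary above
    echo "approx map size: $(( (avail_kib - reserve_kib) / 1024 / 1024 )) GiB"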


So this is what to do if you find yourself in this situation:-

fallocate -l 350G /home/safe/.local/share/ant/dummy_file

Obviously tune the size of the dummy_file to be what you need it to be.

This resulted in an ant-node with 94GB of virtual memory allocated to it. Hopefully that is enough. Based on what is in the source code it seems it is going to be different for different combinations of RAM and disk space anyway.

Then add nodes and start them.

Then you can just rm the dummy_file.
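
Putting the whole workaround together (the path and the dummy-file size are just what suited my 512GB drive; tune both to your own setup):

    # 1. swallow most of the free space so the computed map size shrinks
    fallocate -l 350G /home/safe/.local/share/ant/dummy_file
    # 2. add and start the nodes while the disk looks small
    ant node add
    ant node start
    # 3. reclaim the space once the nodes are up and running
    rm /home/safe/.local/share/ant/dummy_file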

Full credit to Claude, but in my defence I had already found the bit in the source that looked relevant (it takes disk space into account), and I had noticed that the 4GB and 8GB RPi4s had different disk sizes.

    let computed = compute_map_size(&env_dir, config.disk_reserve)?;
    info!(
        "Auto-computed LMDB map size: {:.2} GiB (available disk minus {:.2} GiB reserve)",
I would have eventually hit on the idea of fooling it with the dummy file.

The whole thing would have taken me a lot longer than the couple of hours I spent on this without my new friend Claude.

So now I can get the RPi4s in friends’ houses running some nodes.

Hopefully this behaviour with LMDB is something that can be addressed by the devs, @JimCollinson?

Excellent work, @storage_guy :clap: