I’ve cracked the problem with starting a node on a 4GB Raspberry Pi 4. It was down to the combination of the RPi4 having 4GB RAM and a 512GB SSD!
8GB RPi4 with 128GB SATA drive = worked fine.
8GB RPi4 using just the 128GB MMC card = worked fine.
4GB RPi4 with 512GB drive = Error described above.
(I’d bought the two 8GB ones a couple of years ago, when RAM seemed to be the big constraint. When that became less of an issue I bought a bunch of 4GB ones, and last year I put 512GB SSDs in them because it looked like about 10 nodes on one RPi4 was OK in terms of CPU, RAM, network throughput and network sessions.)
I went down rabbit holes with zram and with overcommit settings which did not help.
After some googling and clauding the solution is to create a dummy file that consumes most of the space on the SSD. It can be deleted once you’ve added as many nodes as you are going to run.
The big hint that something was going on with RAM allocation and disk size was that on the working 8GB RPi4, an ant-node gets allocated 110GB of virtual memory, which was suspiciously close to the size of the disk.
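For anyone wanting to check this themselves, the virtual allocation of a process is visible in `/proc` on any Linux box (this is standard kernel plumbing, not anything ant-node specific):

```shell
# Report a process's virtual size (VmSize) and resident size (VmRSS).
# VmSize is address-space reservation only; VmRSS is the physical RAM
# actually in use. On the Pi you'd use: pid=$(pgrep -o ant-node)
pid=$$   # this shell, as a stand-in process for demonstration
grep -E 'VmSize|VmRSS' "/proc/$pid/status"
```

On the affected machine, VmSize for ant-node is the suspicious ~110GB figure while VmRSS stays small.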
This is the summary from Claude:-
The bug: ant-node calculates its LMDB map size based on available disk space (disk space minus 500MB reserve). On a system with a large disk (445GB free), it tries to mmap ~445GB of virtual address space. On a 4GB RAM system, the Linux kernel refuses this mmap call with ENOMEM (os error 12), even though it’s only virtual address space reservation and no physical RAM is actually consumed.
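You can reproduce the kernel behaviour outside ant-node. This is a minimal Python sketch using an anonymous mapping (LMDB’s is file-backed, so the exact threshold will differ, and the outcome depends on your RAM and `vm.overcommit_memory` setting):

```python
import mmap

def try_reserve(nbytes: int) -> bool:
    """Attempt to reserve nbytes of anonymous virtual address space.
    No pages are touched, so no physical RAM is consumed; the kernel
    may still refuse with errno 12 (ENOMEM)."""
    try:
        m = mmap.mmap(-1, nbytes)
        m.close()
        return True
    except OSError:
        return False

# ~445 GiB, roughly what ant-node requests with a 512GB SSD mostly free.
# On a 4GB-RAM box with default overcommit this is likely to be refused.
print("445 GiB reservation accepted:", try_reserve(445 * 1024**3))
```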
Why the 8GB system works: Its disk is only 128GB, so the computed mmap size (~110GB) is small enough that the kernel allows it on an 8GB RAM system.
The workaround: Creating a large dummy file with fallocate reduces apparent free disk space, which reduces the computed mmap size to something the kernel will accept.
The proper fix would be for ant-node to cap max_map_size based on available virtual address space (or a reasonable default like 100GB) rather than raw disk space, or to expose --max-map-size as a command-line argument so operators can set it manually.
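To make that concrete, here is a hypothetical sketch of what the cap could look like. The names (`DEFAULT_MAP_CAP`, `capped_map_size`, the override parameter) are mine, not from the ant-node source:

```rust
/// Hypothetical cap on the auto-computed LMDB map size: 100 GiB.
/// (Name and value are illustrative, not from ant-node.)
const DEFAULT_MAP_CAP: u64 = 100 * 1024 * 1024 * 1024;

/// Compute a map size from free disk space minus the reserve, but
/// clamp it so huge disks on small-RAM machines don't produce an
/// mmap the kernel will refuse. An explicit --max-map-size style
/// override wins outright.
fn capped_map_size(free_disk: u64, disk_reserve: u64, override_size: Option<u64>) -> u64 {
    if let Some(size) = override_size {
        return size;
    }
    free_disk.saturating_sub(disk_reserve).min(DEFAULT_MAP_CAP)
}

fn main() {
    let gib: u64 = 1 << 30;
    // 445 GiB free, 0.5 GiB reserve: clamped to the 100 GiB cap
    // instead of a ~445 GiB mmap request.
    println!("{} GiB", capped_map_size(445 * gib, gib / 2, None) / gib);
}
```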
So this is what to do if you find yourself in this situation:-
```shell
fallocate -l 350G /home/safe/.local/share/ant/dummy_file
```
Obviously tune the size of the dummy_file to be what you need it to be.
This resulted in an ant-node with 94GB of virtual memory allocated to it. Hopefully that is enough. Based on what is in the source code, the figure is going to differ between different combinations of RAM and disk space anyway.
Then add nodes and start them.
Then you can just rm the dummy_file.
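The whole sequence, end to end, looks like this. For demonstration it uses a temp dir and a tiny file; on the Pi you’d use `350G` and the `/home/safe/.local/share/ant/dummy_file` path as above:

```shell
# Demonstration of the workaround using a temp dir and a 1M file;
# substitute the real path and 350G on the Pi.
dir=$(mktemp -d)
dummy="$dir/dummy_file"

df -h "$dir"               # note apparent free space before
fallocate -l 1M "$dummy"   # on the Pi: fallocate -l 350G ...
df -h "$dir"               # apparent free space drops

# ...add and start your nodes at this point; each one computes a
# smaller LMDB map size from the reduced free space...

rm "$dummy"                # reclaim the space once nodes are running
rmdir "$dir"
```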
Full credit to Claude, but in my defence I had already found the bit of the source that looked like it was involved and was taking disk space into account, and I had noticed that the disk sizes differed between the 4GB and 8GB RPi4s.
```rust
let computed = compute_map_size(&env_dir, config.disk_reserve)?;
info!(
    "Auto-computed LMDB map size: {:.2} GiB (available disk minus {:.2} GiB reserve)",
```
I would have eventually hit on the idea of fooling it with the dummy file.
Without my new friend Claude, the whole thing would have taken me a lot longer than the couple of hours I spent on it.
So now I can get the RPi4s in friends’ houses running some nodes.
Hopefully this behaviour with LMDB is something that can be addressed by the devs @JimCollinson ?