I fixed the typo, thank you. After you press Ctrl+U to upgrade the nodes, which can take some time, you will have to press Ctrl+S to start them again. Some community members have scripts to automate this.
Those instructions appear to be for the launchpad. The section I linked to, and that is relevant to me, is using the CLI.
It should be the same process with the CLI. After upgrading, you have to start them again.
Why then does antctl upgrade --help include --do-not-start?
What am I doing wrong here?!
antctl add --rewards-address <redacted> --node-port 14001-14002 --count 2 --enable-metrics-server --metrics-port 13001-13002 evm-arbitrum-one
error: unexpected argument '--metrics-port 13001-13002' found
tip: a similar argument exists: '--metrics-port'
Usage: antctl add --rewards-address <REWARDS_ADDRESS> --node-port <NODE_PORT> --count <COUNT> --enable-metrics-server --metrics-port <METRICS_PORT> <COMMAND>
For more information, try '--help'.
I've tried all kinds of ordering of the command line, and the above is the one suggested by the error output itself.
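One possible reading of the error above, offered as a guess rather than a confirmed diagnosis: the parser quotes '--metrics-port 13001-13002' as a single unexpected argument, which is what you would see if the flag and its value reached antctl as one token, for example because a non-breaking space was pasted in place of a normal space. The sketch below uses a hypothetical helper, has_hidden_chars (not part of antctl), to flag any byte outside printable ASCII in a command line before it is run.

```shell
#!/bin/sh
# Hypothetical helper (not part of antctl): succeeds (exit 0) when the
# given text contains any byte outside printable ASCII, such as the
# UTF-8 bytes of a non-breaking space pasted from a web page.
# Note that tabs are flagged too, since only space through tilde pass.
has_hidden_chars() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}
```

If has_hidden_chars succeeds on the pasted command, retyping the spaces by hand is a cheap thing to try before reordering flags.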
The --do-not-start flag might seem unnecessary in this case, as nodes need to be manually started. It could be a remnant of older configurations or use cases that no longer apply. So, you would indeed need to manually restart the nodes after the upgrade.
Ok, so should the "For CLI Tool Users:" section in the topic include a step 3 to start the nodes again, for parity with the "For Node Launchpad Users:" section, which does instruct to start the nodes again?
If the nodes are upgraded and then left in a stopped state, that means that for hours our nodes are offline. I imagine that's not great for earnings. Is it not possible for the upgrade to be performed on one node at a time, stopping it, upgrading it, then starting it before moving to the next, to minimize downtime?
Yes, I will work on updating the documentation to reflect the changes. Regarding the node restarts, people in the community have created scripts to handle this process. While it's certainly possible from a development perspective, the team is currently focused on more fundamental aspects that are critical for the network's growth. I understand how frustrating it can be, but we're working on addressing these challenges, and we'll get there in due time.
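For anyone wanting to try the one-node-at-a-time approach asked about above, here is a minimal sketch, not one of the community scripts mentioned. It assumes the services are named antnodeN, as in the error output elsewhere in this thread, and that the stop, upgrade and start subcommands accept --service-name. Neither assumption is confirmed in this thread, so check antctl <subcommand> --help first. The run and rolling_upgrade function names and the DRY_RUN variable are made up for illustration.

```shell
#!/bin/sh
# Rolling upgrade sketch: stop, upgrade and restart one node at a time.
# ASSUMPTIONS: service names follow the antnodeN pattern, and antctl's
# stop/upgrade/start subcommands accept --service-name.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: rolling_upgrade FIRST LAST   (e.g. rolling_upgrade 1 100)
rolling_upgrade() {
    i=$1
    last=$2
    while [ "$i" -le "$last" ]; do
        svc="antnode$i"
        # Stop at the first failure so a broken node is easy to spot.
        run antctl stop --service-name "$svc" || return 1
        run antctl upgrade --service-name "$svc" --do-not-start || return 1
        run antctl start --service-name "$svc" || return 1
        i=$((i + 1))
    done
}
```

Running it once with DRY_RUN=1 shows the exact commands it would issue, which is worth reading over before letting it touch real nodes.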
Got it, thanks.
Worth noting in the release. A change in dependencies is also important, because it influences how things work. Frankly, an update to autonomi 0.4.3 resulted in about a 2x download speed increase in our tests. Thanks!
That is how it works, but the node doesn't always start.
I'm noticing my nodes having to work 2-3x harder too. Maybe it's folks doing more uploads or just because they are getting more duties due to the routing fixes, but interesting nevertheless!
So there is a discrepancy between what the help page at antctl add -h says, which is --metrics-port <METRICS_PORT>, and the error above, which suggests: tip: a similar argument exists: '--metrics-port'. But I can't get either to work.
The ordering doesn't seem to make a difference. But I do seem to remember from ages ago that --count had to be first. Any ideas before I throw my computer out the window?!
Get rid of --enable-metrics-server. --metrics-port enables the metrics server anyhow, and the two might conflict, as --enable-metrics-server sets random metrics ports.
This format seemed to work at least in terms of getting the nodes added:-
antctl add --metrics-port 13001-13100 --enable-metrics-server --node-port 14001-14100 --count 100 --rewards-address <redacted> evm-arbitrum-one
But I'll try again without --enable-metrics-server as you suggest, start them with an interval of 2 mins and go to bed!
Thanks for your advice.
I have never enabled the metrics server flag; it's only there for when you do not use --metrics-port.
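Every antctl add command in this thread pairs a port range with a --count of the same size (14001-14002 with 2, 13001-13100 with 100). Assuming the range size has to match --count, a small pre-flight check can catch a mismatch before any services are created. The helpers range_size and ranges_match_count are hypothetical names, not part of antctl.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of antctl): confirm that each
# START-END port range holds exactly as many ports as --count.

range_size() {
    start=${1%-*}   # text before the dash
    end=${1#*-}     # text after the dash
    echo $((end - start + 1))
}

# Usage: ranges_match_count COUNT RANGE [RANGE...]
ranges_match_count() {
    count=$1
    shift
    for range in "$@"; do
        if [ "$(range_size "$range")" -ne "$count" ]; then
            echo "range $range does not hold $count ports" >&2
            return 1
        fi
    done
}
```

For example, ranges_match_count 100 13001-13100 14001-14100 succeeds, while ranges_match_count 100 13001-13002 prints a warning and fails.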
Now, with this command:-
antctl add --metrics-port 13001-13100 --count 100 --node-port 14001-14100 --rewards-address <redacted> evm-arbitrum-one
I get:-
Failed to start 1 service(s):
✕ antnode1: Command failed with exit code 1: Failed to start antnode1.service: Unit antnode1.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy
See user logs and 'systemctl --user status antnode1.service' for details.
Error:
0: Failed to start one or more services
Location:
ant-node-manager/src/cmd/node.rs:835
Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
So it isn't picking up the node-port.
Nuts to this, I'm going to bed. It will all seem clearer in the morning!
The enhancements for the routing table came at some cost. We observed a CPU usage increase of about 26%, but we deemed it acceptable for what it would buy us. However, Qi was confident the solution was scalable.
I would anticipate the usage going back down again with further enhancements in the future. Some of the devs are taking a really deep dive into libp2p's implementation of Kademlia just now and finding some issues there. We may end up needing our own fork.
However, the monitoring on our own nodes doesn't show any huge increase in CPU usage. We've not had to scale down our node density per machine.
That'll be because in your previous attempts it left the first node in a bad state.
A reset and reboot is needed before adding nodes again.
Maybe a reboot is all that is needed to fix the bad state of the service for node 1.
Apparently that isn't it though. I removed the antnodes, rebooted and started antnode1 by itself. It didn't start, and the log file is empty, but this file is in the log dir:-
~/.local/share/autonomi/node/antnode1/logs/critical_failure.log
With these contents:-
[2025-04-03 05:42:51.954578227 UTC] Node terminated due to: "UPnP gateway not found. Enable UPnP on your router to allow incoming connections or manually port forward."
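Since the normal log was empty and the real reason only showed up in critical_failure.log, it may help to check that file for every node at once. The sketch below assumes the directory layout quoted above (one folder per node, each with a logs subfolder); show_critical_failures is a hypothetical helper name.

```shell
#!/bin/sh
# Print every critical_failure.log under the node data directory,
# each preceded by its path. The default root is the location quoted
# in this thread; pass another root as the first argument if yours
# differs.
show_critical_failures() {
    root=${1:-"$HOME/.local/share/autonomi/node"}
    for f in "$root"/*/logs/critical_failure.log; do
        [ -f "$f" ] || continue   # glob matched nothing
        echo "== $f"
        cat "$f"
    done
}
```

Calling show_critical_failures with no arguments after a failed start lists the termination reason for each node that wrote one.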