The solution is for Launchpad to do what safenode-manager does: upgrade one node, restart it immediately, then move on to the next. If safenode-manager kills them all at once and then restarts them, it should do it one at a time as well.
That way each node is down for only a minute or two, which should not get it shunned by any other node.
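The rolling-upgrade idea above can be sketched as a simple loop. This is purely illustrative: the function names (`stop_node`, `upgrade_binary`, `start_node`) are hypothetical stand-ins, not the real safenode-manager API.

```python
# Hypothetical sketch of a one-node-at-a-time rolling upgrade.
# The three callables are illustrative stand-ins, not real
# safenode-manager functions.
def rolling_upgrade(nodes, stop_node, upgrade_binary, start_node):
    """Upgrade each node in turn: stop, swap the binary, restart
    immediately, so each node is offline only briefly and is less
    likely to be shunned by its peers."""
    for node in nodes:
        stop_node(node)
        upgrade_binary(node)
        start_node(node)  # restart before moving on to the next node

# Toy usage with recording stubs, just to show the ordering:
events = []
rolling_upgrade(
    ["node-1", "node-2"],
    stop_node=lambda n: events.append(("stop", n)),
    upgrade_binary=lambda n: events.append(("upgrade", n)),
    start_node=lambda n: events.append(("start", n)),
)
# Each node completes stop -> upgrade -> start before the next begins.
```

The point of the ordering is that at any moment at most one node is offline, so the rest of the fleet keeps serving records throughout the upgrade.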
Temporary shunning is another attack vector. In blackout situations or after a long-term stoppage, the node should be able to tell from its logs that it was stopped for a long time, use the old secret_key to decode its record_store, restart with a new secret_key for a new peer_id, and re-encode its record_store while offering up its records.
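The re-key idea can be sketched as: decrypt each stored record with the old key, generate a fresh key (which would imply a new peer_id), and re-encrypt everything under it. The XOR stream cipher below is a toy stand-in for illustration only, not the real safenode cryptography, and `rekey_record_store` is a hypothetical name.

```python
# Toy sketch of re-keying a record store after a long stoppage.
# The SHA-256-based XOR stream cipher is illustrative only.
import hashlib
import os

def keystream(key: bytes):
    """Infinite byte stream derived from the key (toy construction)."""
    counter = 0
    while True:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        yield from block
        counter += 1

def xor_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data against the keystream; the same call decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key)))

def rekey_record_store(records: dict, old_key: bytes):
    """Decode each record with the old secret key and re-encode it
    under a freshly generated one (hypothetical helper)."""
    new_key = os.urandom(32)  # fresh secret_key -> new peer_id
    rekeyed = {name: xor_crypt(new_key, xor_crypt(old_key, blob))
               for name, blob in records.items()}
    return new_key, rekeyed
```

Under this scheme the records themselves survive the restart unchanged; only the key (and hence the node's identity) is rotated, so the node can rejoin and offer up its records without carrying the shunned peer_id.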
It doesn’t work like that. If you redirected the ports in the router but did not set them in the Launchpad, it does not know there are redirected ports and uses random ones.
I was pretty sure safenode-manager did the upgrade one node at a time, restarting the node in the process (stop, upgrade, start for each node in turn).
It’s in the quote I quoted. Launchpad leaves the nodes stopped and the user has to start all the nodes again with Ctrl+S.
And you seemed to be saying that safenode-manager leaves them started, since that would be the way to process them one at a time: the node program sits in the .local/bin directory, and the upgrade just downloads the new version there.
Or are you saying safenode-manager also leaves the nodes stopped, and the user has to run a safenode-manager start command again?
Leaving them stopped was changed, so I am pretty sure that description is outdated now. But even if they were left stopped, they were still being processed for upgrade one at a time, in sequence.
Yeah, as far as I’m aware, they do now. So the description is outdated. We discussed changing it on a call we had before we released the latest version.
Edit: the node manager always did this, but for some reason the launchpad opted to leave them stopped. It was the behaviour of the launchpad that changed.