Bug Report: “Node events channel closed!” in v0.3.9 – Affects early-started nodes (Windows)
Summary:
After upgrading to v0.3.9, a significant portion of my nodes (typically ~70–80 out of 600) shut down shortly after launch with the error:
Node events channel closed!
This issue did not occur in v0.3.8 using the exact same configuration. The problem appears specific to Windows, as Linux node-runners using --interval 60000 report no similar failures.
System Info:
- OS: Windows 11 Pro, i9 CPU, 64GB RAM
- Autonomi version: v0.3.9 (v0.3.8 works fine)
- Launch: 600 nodes with --interval 150000
- UDP ports: unique per node, no conflicts
Observed Behavior:
- Nodes ≤ antNode367 commonly shut down
- Later-started nodes (> antNode367) stay running
- Logs show:
  - QUIC ConnectionClosed with empty reason
  - Slowing of RT discovery
  - Then: NetworkEvent channel is closed and forced shutdown
Notes:
- Not caused by high interval; happens early in startup
- No resource exhaustion (CPU/mem/ports) observed
- Likely a regression in v0.3.9, possibly thread/channel related on Windows
---------------------------------------------------------------------------------
Note: usually, when I restart the nodes, the stopped nodes restart without issues.
Here is a snippet of the log from one of the nodes that stopped with this error:
[2025-04-08T11:50:47.731067Z INFO ant_networking::driver 916] Set responsible range to Distance(18092513943330655534932966407607485602073435104006338131165247501236426506235)(Some(253))
[2025-04-08T11:51:02.722565Z INFO ant_networking::driver 916] Set responsible range to Distance(18092513943330655534932966407607485602073435104006338131165247501236426506235)(Some(253))
[2025-04-08T11:51:14.016478Z ERROR ant_networking::event::swarm 478] IncomingConnectionError Valid from local_addr:?/ip4/0.0.0.0/udp/58959/quic-v1, send_back_addr /ip4/65.109.92.166/udp/40023/quic-v1 on ConnectionId(2968) with error Transport(Other(Custom { kind: Other, error: Right(Custom { kind: Other, error: Custom { kind: Other, error: Connection(ConnectionError(ConnectionClosed(ConnectionClose { error_code: APPLICATION_ERROR, frame_type: None, reason: b"" }))) } }) }))
[2025-04-08T11:51:17.728505Z INFO ant_networking::driver 916] Set responsible range to Distance(18092513943330655534932966407607485602073435104006338131165247501236426506235)(Some(253))
[2025-04-08T11:51:21.066693Z ERROR ant_networking::event::swarm 478] IncomingConnectionError Valid from local_addr:?/ip4/0.0.0.0/udp/58959/quic-v1, send_back_addr /ip4/74.81.33.41/udp/7946/quic-v1 on ConnectionId(2969) with error Transport(Other(Custom { kind: Other, error: Right(Custom { kind: Other, error: Custom { kind: Other, error: Connection(ConnectionError(ConnectionClosed(ConnectionClose { error_code: APPLICATION_ERROR, frame_type: None, reason: b"" }))) } }) }))
[2025-04-08T11:51:28.882587Z ERROR ant_networking::event::swarm 478] IncomingConnectionError Valid from local_addr:?/ip4/0.0.0.0/udp/58959/quic-v1, send_back_addr /ip4/135.181.132.16/udp/35529/quic-v1 on ConnectionId(2970) with error Transport(Other(Custom { kind: Other, error: Right(Custom { kind: Other, error: Custom { kind: Other, error: Connection(ConnectionError(ConnectionClosed(ConnectionClose { error_code: APPLICATION_ERROR, frame_type: None, reason: b"" }))) } }) }))
[2025-04-08T11:51:32.724047Z INFO ant_networking::driver 916] Set responsible range to Distance(18092513943330655534932966407607485602073435104006338131165247501236426506235)(Some(253))
[2025-04-08T11:51:47.723508Z INFO ant_networking::driver 916] Set responsible range to Distance(18092513943330655534932966407607485602073435104006338131165247501236426506235)(Some(253))
[2025-04-08T11:51:54.878050Z ERROR ant_networking::event::swarm 478] IncomingConnectionError Valid from local_addr:?/ip4/0.0.0.0/udp/58959/quic-v1, send_back_addr /ip4/136.243.94.178/udp/25301/quic-v1 on ConnectionId(2971) with error Transport(Other(Custom { kind: Other, error: Right(Custom { kind: Other, error: Custom { kind: Other, error: Connection(ConnectionError(ConnectionClosed(ConnectionClose { error_code: APPLICATION_ERROR, frame_type: None, reason: b"" }))) } }) }))
[2025-04-08T11:51:57.720547Z INFO ant_node::log_markers 69] IntervalReplicationTriggered
[2025-04-08T11:52:02.446407Z INFO ant_networking::network_discovery 263] It has been 180s since we last added a peer to RT. Slowing down the continuous network discovery process. Old interval: 390s, New interval: 527s
[2025-04-08T11:52:02.464460Z INFO ant_networking::network_discovery 388] With min_full_bucket_index of Some(247), targeting buckets of [247, 246, 245, 244, 243, 242, 241, 240, 239, 238]
[2025-04-08T11:52:02.464505Z INFO ant_networking::network_discovery 128] Going to undertake 9 get_closest queries for non_full_buckets
[2025-04-08T11:52:02.473037Z ERROR ant_node::node 357] The NetworkEvent channel is closed
[2025-04-08T11:52:02.473400Z INFO antnode 450] Node is stopping in 1s…
[2025-04-08T11:52:03.480045Z ERROR antnode 459] Node stopped with error: Node events channel closed!
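
In case it helps anyone check whether their own nodes hit the same sequence, here is a rough Python sketch for scanning a node log for the markers seen in the excerpt above. The marker strings are copied from that excerpt. The function name `summarize_node_log`, the `LogSummary` fields, and passing log file paths on the command line are my own choices for illustration, not part of antnode or its tooling.

```python
import re
import sys
from dataclasses import dataclass
from typing import Optional

# Marker substrings copied from the log excerpt in this report.
QUIC_CLOSED = "ConnectionClosed(ConnectionClose"
EMPTY_REASON = 'reason: b""'
DISCOVERY_SLOWED = "Slowing down the continuous network discovery"
CHANNEL_CLOSED = "channel is closed"
NODE_STOPPED = "Node stopped with error: Node events channel closed!"

# Log lines start with "[<timestamp> <LEVEL> ...".
TIMESTAMP = re.compile(r"^\[(\S+)")


@dataclass
class LogSummary:
    """Hypothetical summary record; the field names are illustrative."""
    quic_closed_empty_reason: int = 0
    discovery_slowed: bool = False
    channel_closed_at: Optional[str] = None
    stopped_with_channel_error: bool = False


def summarize_node_log(text: str) -> LogSummary:
    """Count the markers from this report in the text of one node log."""
    summary = LogSummary()
    for line in text.splitlines():
        if QUIC_CLOSED in line and EMPTY_REASON in line:
            summary.quic_closed_empty_reason += 1
        if DISCOVERY_SLOWED in line:
            summary.discovery_slowed = True
        if CHANNEL_CLOSED in line and summary.channel_closed_at is None:
            # Record the timestamp of the first "channel is closed" line.
            match = TIMESTAMP.match(line)
            summary.channel_closed_at = match.group(1) if match else "unknown"
        if NODE_STOPPED in line:
            summary.stopped_with_channel_error = True
    return summary


if __name__ == "__main__":
    # Usage (paths are whatever your node log files are called):
    #   python scan_logs.py <log file> [<log file> ...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            print(path, summarize_node_log(handle.read()))
```

Running this over the logs of stopped and surviving nodes should show whether the empty-reason QUIC closes and the discovery slowdown appear only on the nodes that stopped, or on all of them.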