I am sorry neo, but I will have to circle back on this topic (as and when time permits). I was making observations from the perspective of a single upload stream, based on the upstream project's notes, trying to understand their reasoning for a value of 10MiB to support 100MiB/s at 100ms and 1GiB/s at 10ms latencies, etc.:
This should be set to at least the expected connection latency multiplied by the maximum desired throughput.
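That note is just the bandwidth-delay product (window >= latency x throughput). A quick sanity check of the quoted numbers (this is my own arithmetic, not anything taken from the code base):

# Sanity check of the bandwidth-delay-product rule of thumb quoted above.
# Plain arithmetic only; nothing here comes from the antnode/libp2p code base.
MiB = 1024 * 1024
GiB = 1024 * MiB

cases = [
    ("100 MiB/s at 100 ms", 100 * MiB, 0.100),
    ("1 GiB/s at 10 ms", GiB, 0.010),
]
for label, throughput_bytes_per_s, latency_s in cases:
    window = throughput_bytes_per_s * latency_s
    print(f"{label}: window >= {window / MiB:.1f} MiB")

# Both cases land at roughly 10 MiB, which is presumably where the 10MiB default comes from.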
How would multiple upload streams be implemented in the current code base (antnode + libp2p): at which layer and in which process, and are they already implemented or would they need to be? The aim would be to avoid impacting upload bandwidth utilisation on larger uplinks if going below the 10MiB value (orders of magnitude less, as you are suggesting?).
i.e. is it a drop-in replacement without any negative impact, or does it need heavy testing for both user groups, with or without further code changes or tweaks to antnode/libp2p?
QUIC itself has multi-stream capability for a single upload. So libp2p could in theory use multiple streams for the one chunk, each with the max window size.
But I would still recommend limiting the number of streams, since there could be many nodes uploading at once. Maybe 2 streams in QUIC, 3 max. This would maximise uploads despite the delays waiting on ACK/NACK responses.
The real performance is obtained by having multiple streams at once. Not for the one node but for multiple nodes.
Yes, agree, but what's the current situation and what are the current code base settings? (It would need more input from others on the team.)
All I was suggesting was to not make an order-of-magnitude change to a single setting, but instead move the needle in the right direction (if required) on a per-stream basis (min vs max uplink support at different latencies), and still gain the benefit if lowering per stream is indeed the right decision.
If, however, multiple upload streams at different layers are already automatically taking place, and other considerations and pre-requisites are met, then it would make sense to move the needle on the tuning even more.
As you can see, maybe it's not just one constant to change (to keep everyone happy); going to a much smaller number now means more work in the code than keeping everything else equal and simply changing the stream value from 10MiB → 5MiB or 2.5MiB without turning other knobs too.
Overall, I do appreciate the insight and discussion on the topic, but I am out of time today to think more about this. Cheers!
I suspect the big guys wanting to squeeze every drop of performance out will have the resources to tune the number up too. Maybe the default should be for the average home user though?
Please see my earlier comments regarding the impact of that setting on one specific stream (certain assumptions were made when reviewing the recent conversation).
If the current code base supports multiple streams today, and it's a drop-in replacement, I am not suggesting we not make the change…
At the same time, if the current code base, as is, with just one knob turned, significantly prevents one from using their upload bandwidth, it will be a huge deterrent / headache / source of delays for partners and organisations that want to upload data (PBs) to Autonomi (personal opinion).
Overall, we need to take both sides here and make the right decisions (in tandem) to support both cases (personal opinion).
Remember, a data centre operator uploading is also uploading with a batch size greater than one, and will have multiple uploads at once if serious about using their links.
So there is no need for a single chunk to upload on more than one stream for the 10Gbps-link operator. And if QUIC is set to 2 or 3 streams per connection, then that is even better. One upload is a batch of 4 chunks times 2 streams each, which is 8 concurrent streams per upload process. 10 upload processes would be 80 streams, plenty to ensure 99+% efficiency.
The greatest inefficiency will be the download speeds of each node it is uploading to, and the delays for ACK/NACK will be similar whether the other node's download speed is 100Mbps or 1000Mbps.
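To make that back-of-envelope arithmetic explicit (the batch, stream, and window figures are the assumed values from the example above, not code-base defaults):

# Concurrency arithmetic from the example above; all values are assumptions,
# not defaults taken from ant/antnode.
chunks_per_batch = 4       # chunks uploading in parallel per upload process
streams_per_chunk = 2      # QUIC streams per chunk, as suggested above
upload_processes = 10      # parallel upload processes on the big link

concurrent_streams = chunks_per_batch * streams_per_chunk * upload_processes
print(concurrent_streams)  # 80 streams in flight

# With a hypothetical 250 KiB window per stream, total data in flight:
window_bytes = 250 * 1024
in_flight_mib = concurrent_streams * window_bytes / (1024 * 1024)
print(f"{in_flight_mib:.1f} MiB in flight")  # ~19.5 MiB

# By the bandwidth-delay product, ~19.5 MiB in flight at 100 ms RTT can
# sustain roughly 195 MiB/s, i.e. on the order of 1.6 Gbps for such a setup.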
Is the code in its current state such that simply changing this stream window from 10MiB to 250KB (the 98% reduction you suggested earlier) makes ant, with its current default settings, use 1Gbps properly, or even 10Gbps properly, against the network without any adverse effects?
If not, it's not just this specific constant value being changed in code by itself that is required.
I was only reminding folks there is more than one use case and target audience here. If we are going to change code, let's not shift from one extreme to another to suit one audience over another; let's really try to aim for default values for ant and antnode that handle 95% of the spectrum, and leave the flags or env vars for the 5% edge cases.
I just feel we have got to make the defaults (concurrency/stream/batch size etc.) play nicely (within a supported throughput range in Mbps, at certain min/max latencies (within reason)) out of the gate, so the user experience is way less complicated and close to "just works" for most folks (ant & antnode both).
Noting: anything <= 10Gbps is, I think, not an edge case for bandwidth in general. To me, higher than this is where more advanced knobs will have to come into play (more processes, higher batch size, tuning of the stream window, etc.) for seriously large players (personal opinion).
I don’t understand all the settings being talked about. I think that’s just a lack of effort on my part as I think you are both putting your points very clearly.
But my point is that if the project prioritises big users (whether nodes or big uploaders) over small node runners or uploaders with lower bandwidth then it has gone wrong. That would be a big departure from the original vision. And not just philosophically: it materially affects the resilience of the network. A million people running 10 nodes with ease is far better than 1,000 being able to run 10,000 at maximum efficiency and speed.
We are talking about the acceptable range and tolerance here that the network should support (ant and antnode jointly), and making it so without a lot of configuration or tweaks for the end user, so the user experience is great.
My concern was jumping too fast with only one knob turned, opening up a cascading impact on other areas, without fully understanding the overall situation or making sure it solves the target goals (latency/throughput) that are reasonable to attain systematically with the code base.
If you have a latency of 10000ms and low bandwidth, do you still want to be supported, compared to a big player with high bandwidth and super-low latency? The answer from both of these sides will be yes, but the two scenarios have different concerns when it comes to default configurations in the code base.
Noting: it's not that we don't have a working network today; the majority of folks can connect. Yes, some do still have issues, but are they the 1% of 1%, or a much bigger number, for which we would have to take further action to support (out of the gate)? If action is needed, it must be a positive benefit to both ant and antnode in tandem, else we have a regression in performance.
I uploaded the same file (one char changed each time) with ANT_MAX_STREAM_DATA set to 125000, 250000 and 500000, and the default (1000000?).
ANT_MAX_STREAM_DATA=125000
Upload completed in 150.182018769s
ANT_MAX_STREAM_DATA=250000
Upload completed in 29.034228113s
ANT_MAX_STREAM_DATA=500000
Upload completed in 71.567204653s
ANT_MAX_STREAM_DATA unset (default)
Upload completed in 42.583984625s
willie@gagarin:~$ ANT_MAX_STREAM_DATA=250000 time ant file upload ~/wallet-scraper.sh
Logging to directory: "/home/willie/.local/share/autonomi/client/logs/log_2025-01-16_00-05-35"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/willie/wallet-scraper.sh"
Upload completed in 29.034228113s
Successfully uploaded: /home/willie/wallet-scraper.sh
At address: 2122250501373611285
Number of chunks uploaded: 6
Total cost: 18 AttoTokens
8.79user 5.70system 3:03.23elapsed 7%CPU (0avgtext+0avgdata 253188maxresident)k
38184inputs+8928outputs (163major+86422minor)pagefaults 0swaps
willie@gagarin:~$ nano wallet-scraper.sh
willie@gagarin:~$ ANT_MAX_STREAM_DATA=125000 time ant file upload ~/wallet-scraper.sh
Logging to directory: "/home/willie/.local/share/autonomi/client/logs/log_2025-01-16_00-10-04"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/willie/wallet-scraper.sh"
Upload completed in 150.182018769s
Successfully uploaded: /home/willie/wallet-scraper.sh
At address: 17497534230246792202
Number of chunks uploaded: 6
Total cost: 18 AttoTokens
12.12user 7.71system 5:37.71elapsed 5%CPU (0avgtext+0avgdata 247716maxresident)k
8inputs+11440outputs (0major+86142minor)pagefaults 0swaps
willie@gagarin:~$ nano wallet-scraper.sh
willie@gagarin:~$ ANT_MAX_STREAM_DATA=500000 time ant file upload ~/wallet-scraper.sh
Logging to directory: "/home/willie/.local/share/autonomi/client/logs/log_2025-01-16_00-16-27"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/willie/wallet-scraper.sh"
Upload completed in 71.567204653s
Successfully uploaded: /home/willie/wallet-scraper.sh
At address: 5194299803128246653
Number of chunks uploaded: 6
Total cost: 18 AttoTokens
9.53user 5.78system 3:17.71elapsed 7%CPU (0avgtext+0avgdata 256584maxresident)k
0inputs+9744outputs (0major+87014minor)pagefaults 0swaps
willie@gagarin:~$ time ant file upload ~/wallet-scraper.sh
Logging to directory: "/home/willie/.local/share/autonomi/client/logs/log_2025-01-16_00-39-04"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/willie/wallet-scraper.sh"
Upload completed in 42.583984625s
Successfully uploaded: /home/willie/wallet-scraper.sh
At address: 214339557330243543
Number of chunks uploaded: 6
Total cost: 18 AttoTokens
real 2m52.237s
user 0m7.573s
sys 0m5.566s
I have 20 nodes running on this box, I’ll try the same from a VPS
Ah, sorry, I might have misconstrued something then. I thought we were talking about a setting that would be set in stone and wouldn't be changeable. If we're talking about a default then that's totally different.
If we're talking about settings that can be optimised for a user's setup, I'd say have the default be suitable for close to the lowest common denominator for bandwidth, connections, buffers on routers, etc. Then make it possible to lower them for a terrible connection or raise them for people who know what they are doing.
Maybe even settings in the Launchpad for:
I live in a datacentre
I have lots of bandwidth at home and a fancy router
My internet connection is made of string and prayers
Thanks @Southside! This is what I am talking about: one setting changed without adequate other defaults (considered or tweaked) within ant or antnode may, by itself, not be the most optimal or improved out-of-the-box experience for the majority of folks compared to the status quo.
We need a lot more testing on this.
How about diverting some of the Random Rewards to folks who will run a script that creates a, say, 2KB random file and uploads it with a set range of values for $ANT_MAX_STREAM_DATA?
We really need a large group who will run set tests as and when needed.
From a Hetzner VPS with 35 nodes - nearly out of RAM
DEFAULT
Upload completed in 96.679032631s
ANT_MAX_STREAM_DATA=125000
Upload completed in 36.649111228s
ANT_MAX_STREAM_DATA=250000
Upload completed in 23.476114863s
ANT_MAX_STREAM_DATA=500000
Upload completed in 64.108910763s
worker@noderunner01:~$ ANT_MAX_STREAM_DATA=250000 time ant file upload ~/wallet-log
Logging to directory: "/home/worker/.local/share/autonomi/client/logs/log_2025-01-16_00-06-58"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/worker/wallet-log"
Upload completed in 23.476114863s
Successfully uploaded: /home/worker/wallet-log
At address: 8610013773673882622
Number of chunks uploaded: 6
Total cost: 18 AttoTokens
29.26user 14.49system 2:40.70elapsed 27%CPU (0avgtext+0avgdata 321624maxresident)k
3488inputs+7800outputs (17major+103450minor)pagefaults 0swaps
worker@noderunner01:~$ nano wallet-log
worker@noderunner01:~$ ANT_MAX_STREAM_DATA=125000 time ant file upload ~/wallet-log
Logging to directory: "/home/worker/.local/share/autonomi/client/logs/log_2025-01-16_00-18-35"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/worker/wallet-log"
Upload completed in 36.649111228s
Successfully uploaded: /home/worker/wallet-log
At address: 13621253930053536962
Number of chunks uploaded: 6
Total cost: 18 AttoTokens
37.57user 19.82system 3:41.41elapsed 25%CPU (0avgtext+0avgdata 314888maxresident)k
42440inputs+8696outputs (181major+104936minor)pagefaults 0swaps
worker@noderunner01:~$ nano wallet-log
worker@noderunner01:~$ ANT_MAX_STREAM_DATA=500000 time ant file upload ~/wallet-log
Logging to directory: "/home/worker/.local/share/autonomi/client/logs/log_2025-01-16_00-34-24"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/worker/wallet-log"
Upload completed in 64.108910763s
Successfully uploaded: /home/worker/wallet-log
At address: 16255552327435558904
Number of chunks uploaded: 6
Total cost: 18 AttoTokens
45.56user 26.90system 4:33.36elapsed 26%CPU (0avgtext+0avgdata 333572maxresident)k
39368inputs+9824outputs (157major+108474minor)pagefaults 0swaps
worker@noderunner01:~$ nano wallet-log
worker@noderunner01:~$ time ant file upload ~/wallet-log
Logging to directory: "/home/worker/.local/share/autonomi/client/logs/log_2025-01-16_00-42-55"
🔗 Connected to the Network Uploading data to network...
Uploading file: "/home/worker/wallet-log"
Upload completed in 96.679032631s
The main conclusion I draw from this is that a best of 23 secs to upload a 34k file is pretty effing poor, I’m sad to say…
See below (it's exposed as an env var for ant, with a certain default in code, so the user can tweak it when running the ant CLI currently).
For antnode, I am not sure what the plan is yet: maintain the current value, or change to a new default. I am also not sure if there are any plans to make it a configurable setting (I suspect likely not, because ideally each antnode should run the same code base for maximum interop?, though anyone can fork the code base and tweak the settings).
What is important is that ant and antnode (default settings in code) provide maximum compatibility with each other and do not cause issues. At the same time, we can't just have the average throughput (within a supported target range, if that is set as a goal) drop like crazy because of a mismatch of conflicting settings between the two that causes more pain than it solves (my personal opinion).
@Shu Can the number of streams per connection be set with an environment var too?
And the whole reason for checking into this was the issue of buffer overflows in routers. This constraint comes from the initial window size the QUIC protocol uses; if there is a problem, it negotiates the size down.
This means the max window size will potentially be causing buffer overflows in a number of ISP routers that only have 6 to 12 MB of buffer space for outgoing packets. For some routers, just having 2 chunks uploading at once could cause the router buffer to drop packets, forcing retries and negotiating down. That translates into unnecessary retries and/or failure to send the block.
We saw this in the first couple of 4MB-max-chunk-size testnets, where a large node "farm" went offline at once, lots of churning was happening, and many other nodes failed as well due to "issues".
Basically I am saying that while a window size of 4MB will be fine for some, others behind ISP routers could have trouble with excessive retries when sending chunks. Yes, many tweaks in the code have improved the situation, but it still needs to be remembered when considering window sizes. A 250KB window size is a lot kinder to ISP routers in general, even those with 20MB or 32MB buffers, than larger window sizes.
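As a rough illustration of the buffer pressure (the buffer and window sizes are the figures from this discussion; the arithmetic is mine, and real router queueing is of course messier than a simple division):

# Rough illustration: how many chunks can be in flight before a router's
# outgoing buffer fills at a given per-stream window size. Buffer and window
# figures come from the discussion above; this is not measured behaviour.
MB = 1_000_000
KB = 1_000

for buffer_bytes in (6 * MB, 12 * MB, 20 * MB):
    for window_bytes in (4 * MB, 500 * KB, 250 * KB):
        chunks = buffer_bytes // window_bytes
        print(f"{buffer_bytes // MB} MB buffer, {window_bytes // KB} KB window: "
              f"~{chunks} chunk(s) in flight before drops")

# A 4 MB window fills a 6 MB buffer with a single chunk in flight; a 250 KB
# window leaves room for dozens before packets start being dropped and retried.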
Also remember the 1/2MB chunk size networks did not exhibit the characteristic comms issues the initial 4MB ones did. Basically, a 500KB window size seemingly worked great, and using 250KB would close off the issue of buffer overflows.
@neo re the testnets of 4MB vs 1/2MB: why the network crashed in the earlier testnets was not directly tied to the chunk size (it was premature to reach that type of conclusion), in my opinion.
I do not want to get into that debate on this topic. That topic was discussed in earlier updates by Jim and through the weekly Thursday updates at great length.
I can circle back with team on number of streams per connection as an env var, but I do not really know what is feasible or not.
I also do not know about antnode (how many knobs will be exposed, if any at all, on these low-level layers).
import json
import os
import statistics
import subprocess
import time

# Autonomi CLI command used for uploading files
UPLOAD_COMMAND = ["ant", "file", "upload"]

# Range of file sizes in bytes and the steps for the ANT_MAX_STREAM_DATA env var
FILE_SIZES = [1024, 1024 * 10, 1024 * 100, 1024 * 1024]
MAX_STREAM_DATA_STEPS = [
    100 * 1024, 250 * 1024, 500 * 1024,
    1 * 1024 * 1024, 2 * 1024 * 1024, 5 * 1024 * 1024, 10 * 1024 * 1024,
]

# Number of repetitions for each (window size, file size) combination
NUM_REPEATS = 3

# results[(max_stream_data, size)] -> list of upload times in seconds
results = {}

for max_stream_data in MAX_STREAM_DATA_STEPS:
    print(f"Running with ANT_MAX_STREAM_DATA={max_stream_data}")
    # Set the environment variable for the current step
    env = dict(os.environ, ANT_MAX_STREAM_DATA=str(max_stream_data))

    for size in FILE_SIZES:
        print(f"\nUpload times for {size} byte files with ANT_MAX_STREAM_DATA={max_stream_data}:")
        times = []
        for rep in range(NUM_REPEATS):
            # Random content so every run uploads fresh chunks (no dedup)
            temp_file = f"test-file-{max_stream_data}-{size}-{rep}.bin"
            with open(temp_file, "wb") as f:
                f.write(os.urandom(size))

            start_time = time.time()
            subprocess.run(UPLOAD_COMMAND + [temp_file], env=env, check=True)
            elapsed = time.time() - start_time

            print(f"  {temp_file}: {elapsed:.2f}s")
            times.append(elapsed)
            os.remove(temp_file)
        results[(max_stream_data, size)] = times

# Write the results to a file
report = [
    {
        "ANT_MAX_STREAM_DATA": max_stream_data,
        "file_size_bytes": size,
        "upload_time_mean_s": statistics.mean(times),
        "upload_time_stddev_s": statistics.stdev(times) if len(times) > 1 else 0.0,
    }
    for (max_stream_data, size), times in results.items()
]
with open("results.json", "w") as f:
    json.dump(report, f, indent=2)
I hope people realise I was talking about the GitHub project that I quoted, and not the Maidsafe devs, as I realise they were looking into this. That is what Bzee was doing.
Sorry if anyone took offence thinking I was talking about the Maidsafe devs.