Tried running safe files upload in multiple terminal tabs at the same time, using a loop in each tab to upload different files. All files are different.
Observations:
about 1 second per small file with only 1 terminal running (no others)
about 1.1 to 1.2 seconds per file with 5 tabs uploading concurrently
each time I run safe files upload I specify a directory specific to the tab and delete the previous safe.log before running. So in theory each concurrent upload writes to its own log file.
some of the uploads produce log files with extra entries (possibly from another concurrent upload?)
Question:
Is it supposed to be possible to upload concurrently?
Or is this a potential bug?
I am asking first, before trying to capture a log entry to see if I can spot the doubling up in the log file.
EDIT: It doesn’t seem to be the log file that is the issue. Checking the safe files upload standard output (which I redirected to a file to keep) showed the file being chunked into a much larger number of chunks (anywhere from double to 4 times as many). A retry of the same file later chunked to 3 chunks. All files are supposed to be 3 chunks, so if common directories or files are used, then 2 or 3 files are sometimes being chunked at the same time and the number of chunks in an upload includes the other files’ chunks.
Further question:
Is the chunking process not able to run concurrently, i.e. are the concurrent chunking processes using a common temp file or chunk artifacts directory?
EDIT2: Tried setting up multiple users; of course each user has its own set of directories, and no problems were detected.
I could separate the log directories when using one user, but could not separate the chunk directory, so that definitely seems to be where the concurrency problem lies when using a single user.
I worked out the problem to be that self-encryption uses a chunk artifact directory, and that directory is the same for any client chunking done under that Linux user account. So if 2 concurrent self-encryptions are running in two terminal tabs and the processes happen at the same time, the risk is that each will send all the chunks from both files being uploaded (see the sketch at the end of this post). The log files showed 6 chunks being uploaded instead of 3, which is how I saw there was a problem.
Of course this is me looking in from the outside, and there may be something else going on.
I saw it because I had multiple clients running under the one Linux user account in multiple terminal tabs, uploading small unique files (each 3 chunks long) in rapid succession. All files being uploaded are unique.
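A minimal sketch of the collision I think is happening, assuming the uploader simply enumerates the shared chunk artifact directory to decide what to send. The directory name, chunk file names and helper functions here are made up for illustration; this is not the client’s actual layout or code.

```rust
// Sketch: two concurrent uploads sharing one chunk-artifacts directory.
use std::fs;
use std::path::Path;

fn chunk_file(label: &str, artifacts_dir: &Path) -> std::io::Result<()> {
    // Pretend self-encryption produced 3 chunks for this file and wrote
    // them into the shared artifacts directory.
    for i in 0..3 {
        fs::write(artifacts_dir.join(format!("{label}_chunk_{i}")), b"chunk bytes")?;
    }
    Ok(())
}

fn chunks_to_upload(artifacts_dir: &Path) -> std::io::Result<usize> {
    // If the uploader just enumerates the directory, it picks up every
    // chunk currently there, including another upload's chunks.
    Ok(fs::read_dir(artifacts_dir)?.count())
}

fn main() -> std::io::Result<()> {
    let shared = Path::new("chunk_artifacts");
    fs::create_dir_all(shared)?;

    // Two concurrent `safe files upload` runs under the same user both
    // chunk into the same directory before either finishes uploading.
    chunk_file("file_a", shared)?;
    chunk_file("file_b", shared)?;

    // Each run now sees 6 chunks instead of its own 3.
    println!("chunks each upload would send: {}", chunks_to_upload(shared)?);

    fs::remove_dir_all(shared)?;
    Ok(())
}
```

With a shared directory both runs would see 6 chunks; give each run its own directory and each would see only its own 3.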
Hmm, we should be using a unique dir per file we’re chunking. If you were doing the same file twice, perhaps that would happen, which could well lead to what you’re describing too.
Every file is generated uniquely, using a seed that includes an incrementing number. Each tab’s numbering started at 1,000,000, 2,000,000 and so on, so the tabs would never overlap in the numbers used. That way I ensured unique files across tabs and within each tab.
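For what it’s worth, a small sketch of that seeding scheme under my own assumptions: each tab takes a disjoint seed range and every file’s content is derived deterministically from its seed. The byte generator and file size are arbitrary stand-ins, not the actual test harness.

```rust
// Sketch: unique test files from disjoint per-tab seed ranges.
use std::fs;

fn generate_file(seed: u64, path: &str) -> std::io::Result<()> {
    // Simple deterministic pseudo-random bytes from the seed (xorshift),
    // enough to make every seed produce distinct content.
    let mut state = seed.wrapping_mul(6364136223846793005).wrapping_add(1);
    let bytes: Vec<u8> = (0..4096)
        .map(|_| {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            (state & 0xff) as u8
        })
        .collect();
    fs::write(path, bytes)
}

fn main() -> std::io::Result<()> {
    let tab = 2u64; // e.g. the second terminal tab
    let base = tab * 1_000_000; // tab 1 -> 1,000,000, tab 2 -> 2,000,000, ...
    for i in 0..5 {
        let seed = base + i;
        generate_file(seed, &format!("upload_{seed}.bin"))?;
    }
    Ok(())
}
```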
I was thinking about this today (as dangerous as that is) and wondered if it might also have an impact on directory uploads.
Do directory uploads perform concurrent uploads of the files? If they do, then it is possible for 2, 3 or 4 processes to try to upload the chunks from all processes at once. I assume there is a maximum number of concurrent uploading processes.
Depending on the number of concurrent uploads, the effect could be significant.
Each process will chunk the file it is assigned to upload and place the chunks in the common directory.
This results in the chunks produced by all concurrent uploads sitting in the common directory.
A potential race condition then exists, as outlined below:
Each process picks the first chunk and goes out to collect quotes. NOTE: this is the same chunk for each process.
The chunk does not exist yet, so the nodes in the close group all give out their quotes.
Each process then pays the node it chose in the close group and uploads the chunk.
Rinse and repeat for all the chunks in the common directory.
Thus you are overpaying.
Now the kicker is that if an error occurs during upload because the same chunk is being uploaded multiple times, potentially to the same node, then a retry occurs and each process pays again. This would continue until each process succeeds or the node returns that the chunk already exists.
Can you imagine how this could chew through your savings?
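Here is a toy sketch of that race, assuming each upload process independently walks the common chunk directory and quotes/pays/uploads every chunk it finds. The pay-and-upload function is a hypothetical stand-in for whatever the real client does, not its actual API.

```rust
// Sketch: two upload processes each paying for every chunk in a shared dir.
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
use std::thread;

static PAYMENTS_MADE: AtomicU64 = AtomicU64::new(0);
// Chunks the network has actually accepted (each is only stored once).
static STORED: Mutex<Vec<String>> = Mutex::new(Vec::new());

fn pay_and_upload(process: &str, chunk: &str) {
    // Each process pays before uploading; by the time a node could say
    // "already have it", the payment has already been made.
    PAYMENTS_MADE.fetch_add(1, Ordering::SeqCst);
    let mut stored = STORED.lock().unwrap();
    if stored.iter().any(|c| c.as_str() == chunk) {
        println!("{process}: {chunk} was already stored (paid anyway)");
    } else {
        stored.push(chunk.to_string());
        println!("{process}: uploaded {chunk}");
    }
}

fn main() {
    // Six chunks sitting in the common artifacts dir: 3 from each of two
    // concurrent uploads.
    let shared_chunks: Vec<String> = (0..6).map(|i| format!("chunk_{i}")).collect();

    // Two upload processes each walk the *whole* shared directory.
    let handles: Vec<_> = ["upload_a", "upload_b"]
        .into_iter()
        .map(|name| {
            let chunks = shared_chunks.clone();
            thread::spawn(move || {
                for chunk in &chunks {
                    pay_and_upload(name, chunk);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    // 6 unique chunks end up stored, but 12 payments were made: each
    // process paid for every chunk, including the other upload's three.
    println!("chunks stored: {}", STORED.lock().unwrap().len());
    println!("payments made: {}", PAYMENTS_MADE.load(Ordering::SeqCst));
}
```

In this model 6 unique chunks get stored but 12 payments are made, and retries on upload errors would only add more.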
It could certainly be an issue if things work that way, but they have fixed the bug I was seeing in the current release, so it wasn’t solely responsible.
I’m not sure that it works differently for a directory than it does for one big file, though. I think it chunks everything first and then starts uploading all the chunks in the same way.
The client is using the same chunk artifact dir, and meanwhile the SE is also using the same temp dir.
To support concurrent chunking, both of these dirs need to be separated out.
Meanwhile, chunking is really resource-consuming work, hence it is not suggested to have concurrent chunking undertaken interleaved, as each individual SE will try its best to grab any resources it can.
Are parallel downloads an issue too, then? I guess I never tried so far…
…oh well… as I understand it, it’s not the self-encryption itself that isn’t thread-safe but the upload/payment process… so the worst that could happen is probably a corrupt download or downloading chunks multiple times…
Are nodes self-encrypting/uploading anything themselves (wallet sends?)? Are those made sure not to interfere with each other’s or the client’s operations?
That really only holds for the current workings, and even then it is not good. People can easily be using multiple tabs. For instance, someone might be uploading a 4 GB video file or a 50 GB Blu-ray image and also want to be uploading other files at the same time. That is not an uncommon way to think/operate.
I think a fix is needed so that each upload in progress gets its own subdirectory within the chunk artifact directory (a rough sketch follows below).
When the network is live there will be many apps out there: someone could be posting away in a forum while they have another upload happening and a home automation system uploading from time to time. You will then be hitting concurrency problems from time to time. Basically this means the client is broken, with that sort of time bomb in it.
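A rough sketch of what per-upload separation could look like, assuming the client can create (and clean up) a subdirectory unique to each upload run; the base path and naming scheme are my own assumptions, not the actual client layout.

```rust
// Sketch: a chunk-artifacts subdirectory unique to each upload.
use std::fs;
use std::path::PathBuf;
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Create an artifacts directory unique to this upload, so two concurrent
/// uploads (same user, different tabs/processes) never see each other's chunks.
fn per_upload_artifacts_dir(base: &str) -> std::io::Result<PathBuf> {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos();
    // PID + timestamp is enough to keep two tabs apart; a random token or
    // UUID would work just as well.
    let dir = PathBuf::from(base).join(format!("upload_{}_{}", process::id(), nanos));
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

fn main() -> std::io::Result<()> {
    let dir = per_upload_artifacts_dir("chunk_artifacts")?;
    println!("chunking into {}", dir.display());
    // ... self-encrypt into `dir`, upload only what is in `dir`, then clean up:
    fs::remove_dir_all(&dir)?;
    Ok(())
}
```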
I had 15 client uploads (rapid, unique files) across 15 user accounts on the one machine. Each user account was rapid-firing uploads at faster than once a second, i.e. 15 files uploaded in 800 ms, every 800 ms, on my local network.
And it still was not at 100% CPU. I had to go to 15 user accounts because of this bug.
I don’t even have a GitHub account. I’ve had no need for one, which is why I am raising topics here.
It’s the SE.
And @qi_ma, the wallet operations are also a problem with concurrency. Before I found the bug in SE concurrency, while using only one user account, the concurrent uploading froze because spends were waiting for a previous spend to finish when it never did.
TL;DR this is a bug. Just imagine if, on the current internet, you couldn’t upload a file from a browser while sending an email. You’d say it’s broken.