Proposal: Tarchive

I’ve been digging into this a bit more this week.

The chunk streamer library accepts an offset and a limit, which means downloading parts of a tarchive is also possible.

For a prototype, I’m going to try the following with AntTP.

For uploading:

  • Create a name.tar of the collection of files
  • Generate a name.tar.json metadata file (or some such) containing the offset/limit of each file.
  • Add both to a public archive and upload it.
  • Adding additional files to the tar would mean generating a new metadata file and public archive, but deduplication will help with larger archives. Smaller archives will likely fit within the minimum number of chunks anyway.
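To make the metadata step concrete, here is a minimal sketch in Python (just for illustration; `build_tar_index` is my own name, not part of AntTP). The standard `tarfile` module already exposes each member's data offset and size, which is all name.tar.json needs to record:

```python
import io
import json
import tarfile

def build_tar_index(tar_bytes: bytes) -> dict:
    """Map each file name to its data offset and length within the tar."""
    index = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        for member in tar.getmembers():
            if member.isfile():
                # offset_data is where the file's bytes begin inside the tar
                index[member.name] = {"offset": member.offset_data,
                                      "limit": member.size}
    return index

# Build a small tar in memory, then generate its name.tar.json companion.
# USTAR format keeps headers at a fixed 512 bytes for predictability.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    for name, data in [("a.txt", b"hello"), ("b.txt", b"world!")]:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

tar_bytes = buf.getvalue()
meta_json = json.dumps(build_tar_index(tar_bytes), indent=2)
```

Both name.tar and meta_json would then go into the public archive together.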

For downloading:

  • The public archive containing the above is downloaded
  • If the metadata file exists and a file name is provided in the query, then the exact offset/limit is used like an HTTP range request to extract the target file directly from the tar file.
  • Adding an LRU cache for all downloaded chunks will allow the above to be cached and other chunks/files to be retrieved quickly.
  • The LRU cache will be used for other immutable data too, providing a performance boost all around.

I can then investigate the performance and see how it stands up. It should mean cheaper/faster uploads, without losing the flexibility to download each file individually.

It may mean downloading a full 4 MB chunk for a small file, but bandwidth is plentiful; latency is the real issue. If other files in that chunk are needed, they will also be cached for near-instant retrieval.
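The worst case is bounded, too. A sketch of the arithmetic (assuming 4 MB chunks, as above):

```python
CHUNK_SIZE = 4 * 1024 * 1024  # assuming 4 MB chunks

def chunks_touched(offset: int, limit: int) -> int:
    """How many chunks a single range read has to download."""
    return (offset + limit - 1) // CHUNK_SIZE - offset // CHUNK_SIZE + 1

# A 1 KB file wholly inside one chunk costs a single 4 MB download;
# only a file straddling a chunk boundary costs two.
```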

Instead of wrapping within a public archive, it may be better to create a dedicated type. However, I suspect most of the performance gains will come from the above.

I hope to have something working on Friday, but will see how much free time I get.
