Update 09 June, 2022

maidsafe · June 9, 2022, 3:40pm

We had planned to go deeper into the governance issues we discussed last week but unfortunately haven’t been able to do so due to a couple of key members of the team needing to take unplanned time off. All being well, we’ll come back to this next week.

In the meantime, please read @jimcollinson’s post on our strategic aims, and be aware that our objectives and vision have not changed one bit. And, please remember to keep the discussions, however passionate and heated, always respectful. We have a forum code of conduct that all members are expected to uphold.

This week we’ll look at data on the network, and what it means for files to be public or private.

General progress

Yogesh has been looking at databases to replace sled db, which is buggy and doesn’t seem to be actively maintained. So far the prime candidates appear to be Persy, a transactional database that optimises for consistency, and Cacache, which @yogesh says “seems to offer the best speed out of the lot with built-in metadata creation and handling”. Neither is perfect but both would probably do the job. Testing continues.

Thanks to @josh for organising the DBC comnet last week. As @Chriso mentioned, depositing owned DBCs isn’t working yet but this is what he’s been working on this week, and @Qi_ma is looking into a DBC reissue bug and also working on spentbook integration.

Meanwhile, @davidrusu continues to work on getting membership information to adults in order to ensure membership and network knowledge (via the signed Section Authority Provider) are in sync across the section.

Public and private data on the Safe Network

What is a file on Safe Network? Simple enough question but the answer is a bit more involved. The basic answer is “content + metadata + datamap” - but what does that mean?

Content

Content is the raw material of the file, the basic binary information. Once this gets to more than 1 MB it is automatically self-encrypted to produce chunks and a datamap. Because of the way self-encryption works, this is deterministic, i.e. self-encrypt the same content any number of times and you’ll get the same chunks. Its security is largely independent of the encryption algorithm (we use AES256) meaning that if the algo is cracked the chunks are still secure.

OK so what is a chunk? Unless you have the datamap, a chunk is a meaningless blob of bits, mostly around 1 MB in size, with a name that’s also its hash. This means we can check if it’s valid – does the name match the hash – but we can’t tell anything else about it. We can see it but we can’t read it, or know where it came from.
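As a rough illustration of the two properties above (determinism, and a name that doubles as a validity check), here is a toy sketch in Python. It is not the real self-encryption algorithm: the tiny chunk size, the SHA-256 based keystream standing in for AES, and the helper names `keystream`, `self_encrypt` and `is_valid_chunk` are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # toy size for the example; the post describes chunks of around 1 MB


def keystream(key: bytes, length: int) -> bytes:
    """Expand a key into `length` bytes using SHA-256 with a counter (stand-in for AES)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def self_encrypt(content: bytes) -> tuple[dict[str, bytes], list[str]]:
    """Return (chunks keyed by name, chunk names in order) for the given content."""
    pieces = [content[i:i + CHUNK_SIZE] for i in range(0, len(content), CHUNK_SIZE)]
    hashes = [hashlib.sha256(p).digest() for p in pieces]
    chunks: dict[str, bytes] = {}
    names: list[str] = []
    n = len(pieces)
    for i, piece in enumerate(pieces):
        # The key is derived from the content itself (this piece and its neighbours),
        # so the same content always produces the same encrypted chunks.
        key = hashes[i] + hashes[(i - 1) % n] + hashes[(i - 2) % n]
        blob = bytes(a ^ b for a, b in zip(piece, keystream(key, len(piece))))
        name = hashlib.sha256(blob).hexdigest()  # chunk name is the hash of the blob
        chunks[name] = blob
        names.append(name)
    return chunks, names


def is_valid_chunk(name: str, blob: bytes) -> bool:
    """A chunk is valid if its name matches the hash of its bytes."""
    return hashlib.sha256(blob).hexdigest() == name
```

Running `self_encrypt` twice on the same bytes gives the same chunk names, and `is_valid_chunk` fails as soon as a single byte of a blob changes.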

Datamap

Right, so what’s a datamap? The datamap is a simple file that contains the unencrypted name of the content and the names of all the encrypted chunks that make it up, so we know where to find them (chunk name == Xor address). If it’s stored unencrypted on the network then anyone can use it to recreate the content. If it’s encrypted or stored on our private client then only we can do that. We’ll come back to encrypting the datamap in a second.
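To make the role of the datamap concrete, here is a minimal Python sketch, assuming a plain dictionary as a stand-in for the network and leaving the encryption step out entirely. The `DataMap` fields and the `store` and `fetch` helpers are hypothetical names for the example, not the actual Safe API.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class DataMap:
    content_name: str        # hash of the original, unencrypted content
    chunk_names: list[str]   # chunk name == address used to look the chunk up


def name_of(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()


def store(content: bytes, network: dict[str, bytes], chunk_size: int = 4) -> DataMap:
    """Split content into chunks, store each under its name, and return the datamap."""
    names = []
    for i in range(0, len(content), chunk_size):
        blob = content[i:i + chunk_size]   # a real chunk would be encrypted at this point
        network[name_of(blob)] = blob
        names.append(name_of(blob))
    return DataMap(name_of(content), names)


def fetch(data_map: DataMap, network: dict[str, bytes]) -> bytes:
    """Anyone holding the datamap can find the chunks and rebuild the content."""
    content = b"".join(network[name] for name in data_map.chunk_names)
    if name_of(content) != data_map.content_name:
        raise ValueError("reassembled content does not match the datamap")
    return content
```

Without the `DataMap`, the `network` dictionary is just a pile of blobs with hash names; with it, the content can be rebuilt in order.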

Metadata

And the last thing we need to mention is the metadata, information about the content. This optionally includes its size, its name, the file type and potentially date created, accessed etc. But wait a second, Safe doesn’t do time! True, but that needn’t be a limitation.

The reason we don’t include metadata with the content is it would ruin deduplication. Let’s say someone uploaded the Sex Pistols song GodSaveTheQueen.mp3, and someone else uploaded exactly the same MP3 but called it GSTQ.mp3. If the name were part of the content the chunks would be completely different so there’d be no deduplication. This means we store the metadata separately from the chunks. We can store it in a datamap on the network or on our client, which allows us to arrange these apparently meaningless blobs to our heart’s content, name and label them as we wish – including time created and time accessed – and organise them into our own directory structures.
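A small Python sketch of the deduplication point, assuming a dictionary as the shared chunk store and unencrypted toy chunks. The `upload` helper and the record layout are made up for the example.

```python
import hashlib

network: dict[str, bytes] = {}   # shared chunk store, keyed by chunk name


def upload(content: bytes, metadata: dict, chunk_size: int = 4) -> dict:
    """Store chunks on the network; keep metadata in a separate client-side record."""
    names = []
    for i in range(0, len(content), chunk_size):
        blob = content[i:i + chunk_size]
        name = hashlib.sha256(blob).hexdigest()
        network[name] = blob     # same bytes -> same name -> same slot, so duplicates collapse
        names.append(name)
    return {"metadata": metadata, "chunks": names}


song = b"no future, no future for you"
first = upload(song, {"name": "GodSaveTheQueen.mp3"})
stored_after_first = len(network)
second = upload(song, {"name": "GSTQ.mp3"})
```

After the second upload the network holds no more chunks than after the first: both records point at the same chunk names and differ only in their metadata.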

Directories can also be content, encrypted, chunked and stored as files with their own data map (which is why small files which don’t go through self-encryption are unreadable – all content is stored in a directory, but that’s one for another day).

Public and private data

The way Safe works is that data that is valid must be stored. This means we can’t delete chunks. But remember files are content plus a datamap.

Content is just meaningless blobs without a datamap, and those blobs are as secure and unknowable as is possible with current technology. To make GodSaveTheQueen.mp3 publicly available we upload it, publish its datamap on the network unencrypted and link to it. Chances are, with a well-known song like that, the chunks will already be there, but the original uploader, who named it GSTQ.mp3, chose to encrypt the datamap or keep it on their client, so their copy stayed private.

So that is the basic difference between public and private data.

If we encrypt the data map with a BLS key, this also allows us to create key shares that we can then send to other people, meaning we have shared private data. BLS gives us this magic for free. This means public/private and shared data are all client-side actions. The network stores data forever and clients use the (root) data map and encryption to make data public, private or shared private.

Useful Links

Feel free to reply below with links to translations of this dev update and moderators will add them here:

Russian; German; Spanish; French; Bulgarian

As an open source project, we’re always looking for feedback, comments and community contributions - so don’t be shy, join in and let’s create the Safe Network together!

RedPill22 · June 9, 2022, 3:41pm

Blind squirrel got the acorn, I mean Gold!

Secretariat415 · June 9, 2022, 3:55pm

Thanks so much to the entire Maidsafe team for all of your hard work!

I’ve noticed that the Uniswap DEX that was set up for eMAID hasn’t been used in a few days.

anon61899651 · June 9, 2022, 3:59pm

Wish I could get something other than bronze for once

TylerAbeoJordan · June 9, 2022, 6:08pm

A lot of parts in motion as usual super ants!

I don’t know if a typo in the update, but it reads ‘caracache’, but appears to actually be called ‘cacache’.

While I didn’t notice if Persy used async, cacache does, which seems very nice. The cacache DB also looks simpler than the Persy - and if the best part is no part, then that may be a plus too, depending on requirements.

As usual, thanx for the update! Cheers.

aatonnomicc · June 9, 2022, 8:14pm

Well done to all the team another well thought out and informative update

JPL · June 9, 2022, 8:28pm

So chunks are like grains of sand, each one different, each once being part of some rock that was ground up by the sea and carried by the tides to make its origins unknowable. AES is quantum-proof, and self-encryption extends the protection further which is good, but the data map is an obvious weak point for private data. I wonder how that’s going to be protected.

happybeing · June 9, 2022, 8:39pm

What do you see as the weakness? The data map is encrypted and the keys are accessible only to the owner and anyone they share them with. So it’s a bit like a wallet.

JPL · June 9, 2022, 8:46pm

Just thinking ahead to the quantum era when all current asymmetric cryptography will be vulnerable

neo · June 10, 2022, 12:24am

I would think that the uploader, when uploading, sets the metadata according to the info they supply for the file, i.e. in the datamap <----- this is like a universal meta store for everyone

Then anybody, including the uploader, can store a set of metadata in their own directory structure, allowing renaming, last accessed (not affected by others) etc etc. <------ this is personal meta store

dirvine · June 10, 2022, 8:49am

I think we have a neat fix for that also. I will explain at high level with some points (to show work in progress)

AES is quantum-resistant for at least the foreseeable future
We use self encrypt for higher levels of security than that for chunks
We could default to a double symmetric enc mechanism for the root data map
As we use a base asymm key that is never exposed to generate more keys specific to events, i.e. a key per register, website etc., we can use the same for your root data map.
So AES(chachapoly(Hash(base secret key + "your root")))

This gets us past trusting a single algo and using 2 of them with a very strong derivable key. I like the simplicity of that, but still feel it’s worth checking for even better.
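A minimal Python sketch of just the key derivation part, Hash(base secret key + "your root"), assuming SHA-256 as the hash. The function name `derive_key` and the labels are hypothetical, and the two encryption layers (AES and ChaCha20-Poly1305) are left out because they need a third-party library.

```python
import hashlib
import secrets


def derive_key(base_secret: bytes, label: str) -> bytes:
    """Derive a 32-byte purpose-specific key from a base secret that is never shared."""
    return hashlib.sha256(base_secret + label.encode("utf-8")).digest()


base_secret = secrets.token_bytes(32)                        # stays on the client
root_key = derive_key(base_secret, "your root")              # key for the root data map
register_key = derive_key(base_secret, "register:example")   # a different key per purpose
```

The same base secret and label always give the same key, so nothing extra has to be stored, while different labels give unrelated keys. A production design would more likely use a dedicated key derivation function such as HKDF rather than a bare hash.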

This is what I was writing up yesterday while travelling (to a funeral). I see this as the way to separate content from personal views (metadata) and allow folk to use timestamps if they wish on their personal data. Also worth noting with BLS encryption we can share keys easily, but the above does not do that for quantum resistance. So some work here yet.

neo · June 10, 2022, 10:20am

Just to be clear, and from the response I am not sure it was seen. The idea is for any files, public or private. The original meta data is supplied by uploader and stored with the datamap for private/public since private data can be shared with the data map. Then each user when they add it to their directories will have another meta data set in the directories allowing each user to have last accessed date, permissions, etc for each place it exists in their directories.

This allows a universal known set of metadata stored with the file/datamap, plus local metadata, and of course the user can reset their local set back to the global one. An advantage here is that a public file whose global metadata has the filename video001.mp4 can be stored by the user as “Queen at Wembley Stadium.mp4”
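One way to picture the global/local split is as a simple overlay, sketched here in Python. The field names and the `effective_metadata` helper are assumptions for illustration only.

```python
def effective_metadata(global_meta: dict, local_meta: dict) -> dict:
    """Local entries override the uploader's global ones; everything else falls through."""
    return {**global_meta, **local_meta}


# Global metadata supplied by the uploader and stored with the datamap
global_meta = {"name": "video001.mp4", "type": "video/mp4"}

# One user's personal metadata, kept in their own directory structure
local_meta = {"name": "Queen at Wembley Stadium.mp4", "last_accessed": "2022-06-10"}

view = effective_metadata(global_meta, local_meta)
reset = effective_metadata(global_meta, {})   # dropping the local set restores the global view
```

Here `view["name"]` is the user's own filename while `view["type"]` still comes from the uploader, and `reset` is identical to the global set.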

dirvine · June 10, 2022, 10:28am

Getting close to tags (cc @joshuef ) and possibly RDF (cc @happybeing ). Interesting angle.

I suspect just public or privately shared as private does not matter so much as original uploader can do what they want and nobody knows.

Unless we go tags/rdf where the content can apply to many tags/dirs/links etc.

DeusNexus · June 10, 2022, 10:50am

Any idea when the Safe Browser will be able to view files that are stored on the community / public test-networks? To me that allows the non-technical to glimpse beyond the wall of the CLI and see actual hosted sites load in front of their eyes.

jonas · June 10, 2022, 3:18pm

I love that you chose a picture with a hot-air balloon, @DeusNexus - reminds me of the logo for Smalltalk: Smalltalk - Wikipedia

chriso · June 10, 2022, 5:42pm

As far as I’m aware, things like the browser would be a post-launch concern. The intention for initial launch would just be the CLI.

19eddyjohn75 · June 10, 2022, 6:00pm

Thx 4 the update Maidsafe devs

Keep hacking super ants and never give up, we’re so close…

tfa · June 10, 2022, 9:34pm

I think the limit is 3 KB

Dimitar · June 12, 2022, 4:27am

Thank you for the heavy work team MaidSafe! I add the translations in the first post
