How to reference a DataMap?

smacz · October 28, 2015, 3:41am

What I’m ultimately searching for is a way to represent a DataMap in 64 Bytes (Identifier in Structured Data).

The questions I’m juggling are:

What is a DataMap? (Is it a vector?)
Is it one piece of Immutable or Structured Data? (Is it multiple?)
How can it be represented? (Can I take a SHA512 of it?)
How is it shared? (What is the location of a DataMap in the network?)

I’ve looked at self_encryption::datamap::Datamap to no avail. Any help would be much appreciated.

digipl · October 28, 2015, 8:58am

https://github.com/maidsafe/self_encryption/blob/master/src/datamap.rs

dallyshalla · October 28, 2015, 11:58am

pub enum DataMap {
    /// If the file is large enough (larger than 3072 bytes, 3 * MIN_CHUNK_SIZE), this algorithm
    /// holds the list of the file's chunks and corresponding hashes.
    Chunks(Vec<ChunkDetails>),
    /// Very small files (less than 3072 bytes, 3 * MIN_CHUNK_SIZE) are not split into chunks and
    /// are put in here in their entirety.
    Content(Vec<u8>),
    /// empty datamap
    None,
}
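
A minimal sketch of how the three variants above might be told apart and inspected. The enum is a local mirror of the one quoted, not the library's own type. `ChunkDetails` here is a hypothetical stand-in with made-up fields, `expected_variant` and `source_len` are illustrative helpers, and the handling of empty files and of files of exactly 3072 bytes is an assumption, since the quoted comments only say "larger than" and "less than".

```rust
/// Smallest chunk size implied by the doc comments above (3072 = 3 * 1024).
const MIN_CHUNK_SIZE: usize = 1024;

/// Hypothetical stand-in for the library's `ChunkDetails`; the real fields may differ.
#[derive(Debug, Clone, PartialEq)]
struct ChunkDetails {
    /// Name (hash) of the encrypted chunk on the network.
    hash: Vec<u8>,
    /// Number of bytes of the original file covered by this chunk.
    source_size: usize,
}

/// Local mirror of the enum quoted above.
#[derive(Debug, Clone, PartialEq)]
enum DataMap {
    Chunks(Vec<ChunkDetails>),
    Content(Vec<u8>),
    None,
}

/// Name of the variant a file of `len` bytes would be expected to produce.
/// Assumptions: an empty file maps to `None`, and a file of exactly
/// 3 * MIN_CHUNK_SIZE bytes is chunked.
fn expected_variant(len: usize) -> &'static str {
    if len == 0 {
        "None"
    } else if len < 3 * MIN_CHUNK_SIZE {
        "Content"
    } else {
        "Chunks"
    }
}

/// Size in bytes of the original file described by a data map.
fn source_len(map: &DataMap) -> usize {
    match map {
        DataMap::Chunks(chunks) => chunks.iter().map(|c| c.source_size).sum(),
        DataMap::Content(bytes) => bytes.len(),
        DataMap::None => 0,
    }
}
```

Either way, the data map is an ordinary value: a small file travels inside it, while a large file is represented only by its list of chunk descriptions.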

dallyshalla · October 28, 2015, 12:03pm

if let Ok(mut file) = File::create("data_map") {

so you can represent it as anything you like I think

smacz · October 29, 2015, 5:00am

File::create means that it is a regular file, but if it gets too big, then it will split into chunks, right?

If it is split into chunks, does that mean there has to be a datamap for a datamap? (<-- this is where I’m confused)

dallyshalla · October 29, 2015, 5:40am

yes a datamap can have a datamap for itself, and that data map for the data map is a “dataAtlas”

if you password protect the data atlas you can store data maps behind the data atlas

plenty of aha moments here, I think that the explanation here is very conceptual and applicable for how fundamentally things work; this is not a code explanation, though I think establishes a good basis for safe network

smacz · October 29, 2015, 6:04am

Already have that video on my hard drive! I think I’ve seen it twice and skimmed it once.

The first hypothetical situation I have then is pretty tangible I think, so I would like to start there if I may. There is much talk about sharing datamaps as a way to “transfer” data. This concept I understand fully. The data is on the network, so you just need to inform the third party of where the chunks are on the network. To do this, you give them a datamap.

If that datamap is not big enough to have to be split into chunks, do you just give them the hash of the datamap to retrieve, and then they can retrieve the datamap and the data on their own? That would make sense if that’s so. One caveat though: isn’t the Structured Data type (I believe datamaps are SD) located through a field called an “Identifier” instead of a hash? Either way, the data is still able to be retrieved from the network only using 64 bytes of location information.

On the other hand though, if that datamap is big enough to have to be split into chunks, do you give them the data atlas so that they can reassemble the datamap that you’re trying to give to them?

I may be misunderstanding the data atlas - as I previously believed there to be only one per user, and it having all of your personal datamaps on there. In which case, sharing the data atlas would not be a secure way to allow a third party to access many files.

dallyshalla · October 29, 2015, 6:53am

You’re right the first time on the data atlas: it is there to encrypt your datamaps. The datamap maps data chunks; if the file is small enough, then all the bytes of the file are in the datamap. The datamap is also encrypted if it is a private file, so when you make a file public to someone you must give them decryption access to the file. I think further details are found in routing rather than the encryptor.

And if the file is public, then just the ‘identifier’ should be enough to start locating the file on the storage dht network.

So it comes down to sharing the location of the data map and the means to decrypt it, which gives access to either its Content or its ChunkDetails, depending on the size of the file to begin with.

smacz · October 29, 2015, 7:16am

So I think I misunderstood what a datamap is. I think what I’m really asking about is a data atlas. As I understand it now, any given datamap is only for one file. Either it holds the whole file (if small enough) or the chunk locations that make up that one file. The data atlas is the list of datamaps. That means that a data atlas is analogous to a directory full of files. (I know that there is a separate implementation for directories, but bear with me - this is just for the analogy)
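
The directory analogy can be sketched as a plain map from file names to data maps. This is only an illustration of the mental model in this post: `DataAtlas` and its methods are hypothetical names, the `DataMap` enum is a simplified local stand-in, and nothing here reflects how the network actually stores or encrypts an atlas.

```rust
use std::collections::BTreeMap;

/// Simplified local stand-in for a data map (one per file).
#[derive(Debug, Clone, PartialEq)]
enum DataMap {
    /// Large file: only the hashes of its chunks are kept.
    Chunks(Vec<Vec<u8>>),
    /// Small file: the bytes themselves are kept.
    Content(Vec<u8>),
}

/// Hypothetical "data atlas": a list of data maps, keyed by file name.
#[derive(Debug, Default)]
struct DataAtlas {
    entries: BTreeMap<String, DataMap>,
}

impl DataAtlas {
    /// Adds or replaces the data map for `name`, returning any previous one.
    fn insert(&mut self, name: &str, map: DataMap) -> Option<DataMap> {
        self.entries.insert(name.to_string(), map)
    }

    /// Looks up the data map for one file.
    fn get(&self, name: &str) -> Option<&DataMap> {
        self.entries.get(name)
    }

    /// Names of all files in the atlas, in sorted order.
    fn file_names(&self) -> Vec<&str> {
        self.entries.keys().map(|k| k.as_str()).collect()
    }
}
```

Under this model, handing someone the atlas hands them every data map in it, which is why the question of how the atlas itself is referenced matters.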

So first question then becomes:

I can give someone a data atlas (and any decryption keys necessary) and then they will have access to all of the files contained within that data atlas. How is that data atlas referenced on the network? Is it one piece of Immutable Data or Structured Data?

digipl · October 29, 2015, 8:42am

There are reserved type_tags for private, public or shared directories, so possibly the data atlas will be Structured Data (otherwise, being only Immutable Data, its management becomes very difficult).

smacz · October 29, 2015, 8:53am

Now imagine I have many files that I want to grant a third party access to. So many files, in fact, that all of the datamaps cannot fit in one data atlas that is only one piece of Structured Data.

Would the preferred solution be to send that third party one data atlas that holds pointers to all of the other data atlases, or to send them all of the data atlases individually, or something else?

happybeing · October 29, 2015, 12:53pm

If data gets too big to sit inside one SD then you are supposed to store the data in a file and store its data map inside the SD (I THINK!). It’s described in the SD RFC.
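
A rough sketch of that rule, assuming it works as described in this post. `SdPayload`, `choose_payload`, the `max_inline` limit, and the `store_as_file` callback are all hypothetical names for illustration; the actual size limit and mechanism are whatever the SD RFC specifies.

```rust
/// Hypothetical payload of one Structured Data item.
#[derive(Debug, PartialEq)]
enum SdPayload {
    /// The data fits, so it is stored directly in the SD.
    Inline(Vec<u8>),
    /// The data was stored as a file; the SD keeps only its serialised data map.
    DataMapOf(Vec<u8>),
}

/// Decides what goes inside the SD.
/// `max_inline` is an assumed size limit in bytes, not a value from the RFC.
/// `store_as_file` stands in for self-encrypting the data and returning
/// the serialised data map of the resulting file.
fn choose_payload<F>(data: Vec<u8>, max_inline: usize, store_as_file: F) -> SdPayload
where
    F: FnOnce(&[u8]) -> Vec<u8>,
{
    if data.len() <= max_inline {
        SdPayload::Inline(data)
    } else {
        SdPayload::DataMapOf(store_as_file(&data))
    }
}
```

The point of the indirection is that the SD stays small no matter how large the data behind it grows.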

dirvine · October 29, 2015, 1:33pm

Also there are directory entries (contain hashes of data maps) which again create a data map, so these can recurse like a filesystem (and they do). For very large datamaps (millions of chunks) the datamap itself can go back through self encryption as well.
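
The filesystem-like recursion can be sketched with an in-memory store that maps 64-byte names to nodes. Everything here is an assumption made for illustration: the `Id` alias, the `Node` enum, the `HashMap` standing in for the network, and `list_files` are hypothetical and do not reflect the actual client code.

```rust
use std::collections::HashMap;

/// Hypothetical 64-byte network name (for example a hash of a data map).
type Id = [u8; 64];

/// What a name can resolve to in this sketch.
enum Node {
    /// A serialised data map for one file (contents elided here).
    FileMap(Vec<u8>),
    /// A directory: entry name paired with the id of a file's data map
    /// or of another directory.
    Directory(Vec<(String, Id)>),
}

/// Recursively collects the paths of all files reachable from `root`.
/// Assumes the structure has no cycles, as with content-addressed data.
fn list_files(store: &HashMap<Id, Node>, root: &Id, prefix: &str, out: &mut Vec<String>) {
    match store.get(root) {
        Some(Node::Directory(entries)) => {
            for (name, id) in entries {
                let path = format!("{}/{}", prefix, name);
                list_files(store, id, &path, out);
            }
        }
        Some(Node::FileMap(_)) => out.push(prefix.to_string()),
        None => {} // unknown id: nothing to list
    }
}
```

In this model a single 64-byte root name is enough to reach an arbitrarily large tree, because each level only stores the names of the level below it.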

happybeing · October 29, 2015, 3:26pm

Am I right in thinking nesting (of versioned, immutable data) more than one level is not for mere mortals like us (but something MaidSafe might be doing for SAFEDrive?), because for versioned data, nesting multiple levels can create version forks which are not able to be handled and resolved?

Or are what you are saying, and what I just said, referring to different things?

dirvine · October 29, 2015, 3:46pm

Yes the core code will do all this AFAIK and the API at clients should assume all is well. The guys will correct me if I am wrong (I need to get more info on client side, Spandan and Krishna just plough through at huge pace and make it all happen).

kirkion · October 29, 2015, 5:56pm

Can we all just stop and appreciate the humility of this guy? So many projects I have seen failed because the leader was insecure and micromanaged the people following his vision. This is how you galvanize people, provide leadership to a community and recruit awesome people to carry on the vision.

I think that someday people will carefully evaluate the maidsafe team in terms of a management problem, trying to figure out how they were able to do so much world changing work.

It’s so exciting even to watch all of this take place.
