I just read the SystemDocs, and what I read elsewhere leads to this question:
If I understood correctly, the vaults a chunk is stored on are determined by XOR proximity of the chunk’s hash and the vault ID. But I also read that upon reconnecting to the network, the node gets a completely new node ID.
If vault ID and node ID are the same, that would mean that upon reconnection, the vault would have to throw away all the data it has stored, because now it has a new ID which is no longer close to the hashes of the chunks it has stored, and thus it will never be asked for that data—and also loses the opportunity to earn safecoin when that data is accessed. It has to wait for new data to be stored and hope it isn’t thrown off the network again before the new data is accessed.
Somewhere else in the docs, it was mentioned that the network keeps up to 16 offline copies of data in its records as the corresponding vaults might come online again. That would point to the vault ID being constant, not being changed on every reconnection—but I’m not sure if this is just old information created before the random node ID was introduced?
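To make the concern above concrete, here is a minimal sketch of XOR-proximity placement. It is not the network’s actual code: the helper names (`name_of`, `closest_vaults`), the use of SHA-512 for every name, and the group size of 4 are all assumptions for illustration.

```python
import hashlib

def name_of(data: bytes) -> int:
    """Hypothetical helper: a 512-bit network name, as an integer."""
    return int.from_bytes(hashlib.sha512(data).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    """Distance between two names in XOR space."""
    return a ^ b

def closest_vaults(chunk_name, vault_ids, k=4):
    """The k vault IDs closest to the chunk name (k=4 is an assumed group size)."""
    return sorted(vault_ids, key=lambda v: xor_distance(chunk_name, v))[:k]

vault_ids = [name_of(f"vault-{i}".encode()) for i in range(20)]
chunk_name = name_of(b"some chunk content")
holders = closest_vaults(chunk_name, vault_ids)

# The closest holder drops off and rejoins under a fresh ID.
old_id = holders[0]
new_id = name_of(b"same vault, fresh random id")
rejoined = [v for v in vault_ids if v != old_id] + [new_id]
new_holders = closest_vaults(chunk_name, rejoined)

# The old ID no longer exists, so nobody will ask it for the chunk.
print("old ID still responsible:", old_id in new_holders)
# Whether the new ID is responsible depends only on where it lands in XOR space.
print("new ID responsible:", new_id in new_holders)
```

If the ID really were regenerated on every reconnect, the data already on disk would only be useful when the new ID happened to land near the same chunk names, which is the problem the question describes.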
@dirvine This is a part from your post (from the blog, joining etc) which I don’t get.
The joining node then connects (encrypted) to the closest nodes as returned by the bootstrap node.
So if I start Maidsafe, the client (the one that only puts and gets data) connects to another node, and it knows its ip:port and public key. That other node (the bootstrap node) replies to me and provides me with a XOR address that’s closest to the one I have? Or do I have my own routing table and ask the other node (again, which knows my ip:port etc.) to connect me to a XOR address?
And vault (and similar) IDs are created randomly and then stored locally by the vault, right, not created from the user ID and password I’m using?
What I’m thinking of is: Say I have a VPS somewhere that I want to use for mining, and of course I’m using a vault at home that will be on and off the network much more often, can I create both with the same user ID and password without running into conflicts?
So just to confirm, the only way this can happen is if the bootstrap nodes that I’m connecting to know/recognize my node from before (since they are closest to me, they would have me in their list as well)? If they recognize me from the past, then they are free to provide a XOR address closest to my previous one, and the chunks I stored before are now back online. If they don’t recognize me (perhaps I’m a brand new node or an imposter claiming to be a certain ID), then they will assign me a totally new address and I will have to accumulate brand new chunks and ranking from scratch.
As I understand it, your PmidNode ID doesn’t change, it is stored on the network as part of your passport, and you can always reconnect with the PmidManager group that is closest to your PmidNode ID. I think the PmidManagers don’t need to remember you, your PmidNode ID is proof of you belonging there. The DataManagers decide which PmidNodes store the data that they manage. The PmidManagers are the intermediaries between the DataManagers and the PmidNodes.
It’s not possible for another node to impersonate you (or use your ID) without having your login information. They would not be able to authenticate to the network that they are the rightful owner of that PmidNode ID.
Hope I got this right.
Edit: Found most of the info here, pretty sure I got the above right: safenetwork – Medium
One of the articles in the blog you linked to says the following:
The PmidManager monitors the status of PmidNodes and notifies the relevant DataManager when the status of the PmidNode changes by either leaving or joining the network.
This seems to imply that they do keep track of vaults that went offline (how long they keep the info of vaults that went offline I do not know). If this weren’t the case, I don’t see what would prevent anyone from claiming to be a specific pmid.
You are both correct, I believe. Think of it like this:
Network Address → Is a point in the network
Direct Network Addressable Element (DNAE) → A thing that exists (data Pmid client etc.) and can speak
Network Addressable Element (NAE) → an element that cannot answer (i.e. immutable data/safecoin/directory etc.)
So a vault etc. is a DNAE, and there are several rules for these in terms of validity. A vault or client will have a key at that exact location, and the key is validatable (it is a public key + signature as content, and the name is the SHA512 hash of that content; this also == the DNAE). So it’s crypto-hard to create an exact key, and when it’s stored on the network in a consensus group it’s even harder, as the network will not allow more than one type (DNAE) at any address. This is the crux of the PKI system (a replacement for certificate authorities). So you could have a client / vault / immutable data / safecoin all at a single address, but they are different types and split into two categories (DNAE and NAE). I am having a debate in-house, as these addresses are typed and I do not believe they should be: the address is like a GPS coordinate, nothing else. The values that exist there are the types. It’s not that important, but extra code is a hateful thing for me. It’s not a simple debate, though; both sides have strong arguments to make. Anyhow, that is the PKI in general.
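A rough sketch of the “name is the SHA512 hash of the content” check described above. The function names and the plain concatenation of key and signature are assumptions, not the network’s real serialisation, and verifying the signature itself is left out.

```python
import hashlib
import hmac

def key_name(public_key: bytes, signature: bytes) -> bytes:
    """Name of a key packet: SHA-512 over its content.
    Assumption: content is simply public_key followed by signature."""
    return hashlib.sha512(public_key + signature).digest()

def name_matches(claimed_name: bytes, public_key: bytes, signature: bytes) -> bool:
    """True if the claimed network name really is the hash of the content."""
    return hmac.compare_digest(claimed_name, key_name(public_key, signature))

# Placeholder bytes standing in for a real public key and its signature.
public_key = b"example-public-key-bytes"
signature = b"example-signature-bytes"
name = key_name(public_key, signature)

print(name_matches(name, public_key, signature))           # genuine content
print(name_matches(name, b"someone-elses-key", signature)) # different key, same claimed name
```

Anyone holding the packet can recompute the hash, so a node cannot claim an address without presenting content that hashes to it.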
A network address in itself, though, is a midpoint in XOR space between a group of vaults (pmids). Each vault will have a different set of NAE than every other node. The closer together two vaults are to a very close NAE, the more likely they are to share those elements in their authority group, but they will differ at NAE further away, even though they appear to be in the network address. This is where we need to recall that no two addresses in the network will share any distance to any other node in the network (@eric explains this very well in his lectures). Therefore a DNAE is very unlikely to have the same address as a NAE, but if it does we do not worry: it’s not an issue, as it’s managed by a group around it in any case.
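The distance property mentioned above can be checked in a toy 4-bit address space. This is only an illustration: the node list, the group size of 3, and the helper names are made up.

```python
def distances_from(target: int, bits: int = 4) -> dict:
    """Map every address in a toy 2**bits space to its XOR distance from target."""
    return {node: target ^ node for node in range(2 ** bits)}

# From any fixed address, every other address sits at a different distance,
# so there are never ties when ranking nodes by closeness.
for target in range(16):
    distances = distances_from(target)
    assert len(set(distances.values())) == len(distances)

def close_group(target: int, nodes: list, k: int = 3) -> list:
    """The k nodes closest to target in XOR space (k=3 is an assumption)."""
    return sorted(nodes, key=lambda n: target ^ n)[:k]

nodes = [1, 4, 6, 9, 12, 14]
print(close_group(5, nodes))   # [4, 6, 1]
print(close_group(7, nodes))   # [6, 4, 1]  same members as for 5, different order
print(close_group(13, nodes))  # [12, 14, 9]  a far-away address gets a different group
```

Addresses 5 and 7 are near each other and end up with the same group members, while address 13 is handled by a completely different set.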
Rejoining is not just a matter of sending the managers the plaintext PMID character string. There is private/public key encryption behind it. It works similarly to Bitcoin addresses, where you use your private key to prove that you are the owner of that particular address.
The main difference is that the PMID can’t simply be the hash of the public PMID key, because then you’d be able to mass-generate them locally to try to dominate a close group. I think @dirvine said that for now the MaidManager close group adds a nonce based on the distance to the node registering the PMID, but that they were looking for an alternative approach.
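To see why a plain hash of the public key would be a problem, here is a small sketch of the “mass-generate locally” attack. Everything in it is hypothetical: the counter-based byte strings stand in for freshly generated public keys, and sharing an 8-bit prefix with the target stands in for “being close”.

```python
import hashlib

NAME_BITS = 512

def name_of(data: bytes) -> int:
    """Hypothetical ID rule under attack: ID = SHA-512(public key)."""
    return int.from_bytes(hashlib.sha512(data).digest(), "big")

def shares_prefix(a: int, b: int, prefix_bits: int) -> bool:
    """True if the top prefix_bits of a and b are identical (i.e. they are XOR-close)."""
    return (a ^ b) >> (NAME_BITS - prefix_bits) == 0

def grind_ids(target: int, prefix_bits: int, count: int):
    """Keep generating throwaway keys until `count` of them hash near target."""
    found, attempt = [], 0
    while len(found) < count:
        candidate_key = f"throwaway-key-{attempt}".encode()  # stand-in for a new keypair
        if shares_prefix(name_of(candidate_key), target, prefix_bits):
            found.append(candidate_key)
        attempt += 1
    return found, attempt

target = name_of(b"chunk the attacker wants to surround")
keys, tries = grind_ids(target, prefix_bits=8, count=3)
print(f"{len(keys)} nearby IDs after {tries} local hashes")
```

All of the work happens offline, with no input from the network. Having the close group contribute something like a nonce, as described above, takes that freedom away.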
@dirvine It does, although I have to read it multiple times to get it. But finally I will, and I think most of us… That’s how we learn isn’t it?
Another thing is the identities on Safenet. Trying to find an IP address from a XOR address is almost impossible, I get that. But what about the close nodes? They know my IP, port, public key etc. So if I ask for a chunk, they have to see it come by, don’t they? They have to know me to get the chunk to my computer using my IP address, port etc.?
And what about these identities? You’ve said that for browsing, a different identity is used than for a vault, so to speak. An identity is a pair of keys, isn’t it? So if I connect to the Safenet on different days and at different times, my closest XOR nodes will probably change, but my identity won’t. How does the identity for browsing ask for data? It needs to send out a request for chunks or a URL (so to speak). So the close nodes again (new or old…) have to see data for the identity come by. That way they could find out which IP address with which XOR address is using which identity… Probably you’ve prevented this in a very smart way, but that’s the part I still don’t get. I’ve been talking with @Melvin and others on Slack about this as well. It’s still some sort of magic to most, even after reading the system docs.
There will be a close node group that knows an address. In routing_v2 we have a nice mechanism, though. Connect to a random group, so they know the address. From there, connect through the network to your manager groups. This kind of connection is encrypted, so it is invisible to the close group. Yes, they can tell you are talking to another group, but with an ID not tied to a person and certainly not tied to a login. The group then becomes a proxy of received messages, so IP snooping is made much harder.
I said previously, though: if you imagine any system going over IP, there has to be knowledge of at least the connection to the network. To this end, using IP connections means there is always a hole. I must add, though, that the XOR address you will download with is a random, meaningless address, so there is no point in your ISP trying to look at the data (encrypted). An attacker would need to get into a group, but where that group is geographically is unknown to him. He may note some nodes in the group that have an IP in the jurisdiction he is in and get a warrant (ha) or whatever to snoop, and then, with a mega list of chunks he is worried about (the easy part), try to see if a node he connects to sends a request for that key. Of course clients need not connect to all group members, but that’s another story (we could force groups geographically apart based on IP lists, which is not nice). So this attack should not be underestimated, but then again, think of a global autonomous network with no knowledge of people’s public names processing data: it is a significantly expensive lottery to try this snoop.
There are also a few tricks we have but after launch for sure.
People get tied up in this part. The answers are available to get around it if we need to, but launch first is my priority. This one does not bother me at all, as I think it sounds much simpler than it actually is. You would connect to a group of globally connected nodes with no idea of who they are, investigate the IP range to see if you can get the owner of it, then raid a house and take away a computer to prove somebody downloaded something, and guess what: in Maidsafe you did not, it was all on a virtual drive in memory, and you log out and presto, no trail. Now this area is something to work on: will any apps leave a trail (write to temp), and if so, which ones and what do they leave behind? So SAFE apps should not in any way leave such a trail, and any trails should be dealt with. This is much, much more important, I think.
Great! Thank you very much. I think this is the part I didn’t get. My close group only helps me to setup a connection to my managers. Once that’s done, the managers can’t see my IP (they only see the XOR-addresses of my closest nodes) and the closest nodes can see my IP but have no clue what I’m saying to the managers. I think that’s already like a big wall of privacy.
Yes, we will be testing in testnet3, I think. Also, connect requests are encrypted end to end, so even hops cannot tell. It’s a balance, though, between connecting to a lot of nodes and being able to find some of them next time. So we will try to open up mechanisms to securely provide bootstrap addresses with limited capability etc., so you bootstrap and then leave that node to join random nodes. It’s easier than many of the other issues we have had by a huge way, though; logging into an ever-changing random group of computers without transmitting any passwords (encrypted or not) is much harder.