Safe API - Registers

Aye essentially allowing for a max register size and then whatever happens inside of it is up to app devs.

I feel like this may be more complexity on the dev though?


Honestly I think registers might well need to be wrapped by a higher level struct anyway, ignoring the size of entries, just when you need to link two together, there’s metadata right there that’ll need to be included and added around that which we’ll need to work out.

1 Like

My feels are this. Registers are immutable (grow only) pointers to anything. This has to be the simplest of possible use cases, where their purpose is to allow collections, history etc.

Registers == pointers
chunks = data

To me this seems as simple as we can make it.
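As a rough illustration of that split, here is a minimal Python sketch. It is not the Safe Network API: `ChunkStore`, `Register`, the linear (non-CRDT) history, and the use of SHA-256 to get 32-byte addresses are all assumptions made for this example.

```python
import hashlib


class ChunkStore:
    """Hypothetical content-addressed store: chunks = data."""

    def __init__(self):
        self._chunks = {}

    def put(self, data: bytes) -> bytes:
        # Assumption: the address is the SHA-256 digest of the content (32 bytes).
        address = hashlib.sha256(data).digest()
        self._chunks[address] = data
        return address

    def get(self, address: bytes) -> bytes:
        return self._chunks[address]


class Register:
    """Hypothetical grow-only register: entries are pointers, never data."""

    POINTER_SIZE = 32  # bytes (assumed pointer size)

    def __init__(self):
        self._entries = []

    def append(self, pointer: bytes) -> None:
        if len(pointer) != self.POINTER_SIZE:
            raise ValueError("entries must be 32-byte pointers")
        # Grow only: existing entries are never modified or removed.
        self._entries.append(pointer)

    def history(self) -> list:
        return list(self._entries)

    def latest(self) -> bytes:
        return self._entries[-1]


store = ChunkStore()
register = Register()
for version in (b"draft 1", b"draft 2"):
    register.append(store.put(version))

print(store.get(register.latest()))
```

The register here only ever holds fixed-size addresses, so reading the current version costs one register read plus one chunk read per pointer followed.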

Fair enough, but what I’m hearing does not seem like it’s undecided. Here and in the folders repo it appears to be when, not if, Entries are limited to pointers. So I’m reluctant to go too far into that while I’m already distracted from the code I’m trying to understand and hopefully contribute to. Also, @neo’s suggestion addressed that in a way that doesn’t have to add much complexity, because it adds an option rather than replacing an easy to understand approach with a hurdle that everyone has to jump.

Providing options for either in the API, one way or another, may be the best way to handle that, and again: not restricting the size to an index allows either approach.

I agree we want to have higher level APIs but I don’t see that happening soon. Great if it can.

Both. If we had higher level APIs maybe they would be able to address some of my concerns but we can’t really have a discussion about those without something concrete in form and time, which doesn’t sound feasible for some time. You guys have a lot on your plate already.

That’s fine as far as it goes but it ignores every point I’ve been making rather than answers them.

1 Like

I am not refusing. What I am seeing is complexity and I am trying to push back and say with pointers and data we can have everything for everyone in the simplest of API.

I’ve caveated that a few times in my last posts.

@neo’s suggestion might make sense for some forms of registers, but also is in itself more complexity. (And kind of just shifts the question to how big should a register be)

Whereas the simplicity of @dirvine’s suggestion is quite neat from a network + typing standpoint.

(If we’re worried about GET size of a register, our registers are probably too big…)


how big should a register be

So this has not been looked at in a long while. Registers haven’t had heaps of love of late (but it is nice to be talking about them again!).

Some napkin maths…

On this point (assuming it’s not just pointers), it seems the natural choice would be smaller than a chunk (otherwise chunks will get shoved into registers…), so < 500kb. As an aside, right now they are 1mb in size! :open_mouth: That will need to be brought down…

Registers themselves when stored are serialised, with all their logic etc as well as the entries… so that limits what that can be.

I’ve just run some rudimentary checks against the serialised bytes written to disk:

  • A one entry register is 350 bytes, two entries, 715 bytes. (for bao and next, 3+4 bytes)… so ~350 bytes overhead per entry.

So, ignoring the extra cost in theory of the compute of permission enforcement.*

  • 1024 (current max) entries by 350 bytes.

500kb max size minus the overhead per entry gives 141600 bytes in total for all entries, which gets us 138 bytes per entry.

Is 138 bytes still useful? (Is that maths correct? It’s very napkinny, please correct me if I’m wrong.) If not, what is our max number of register entries and why?
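The napkin maths above can be checked with a few lines of Python. This is only a sketch: it takes "500kb" to mean 500,000 bytes and uses the 350-byte overhead and 1024-entry maximum quoted in the post, and the constant and function names are made up for the example.

```python
CHUNK_BUDGET_BYTES = 500_000     # "500kb" read as 500,000 bytes (assumption)
MAX_ENTRIES = 1024               # current max entries, as quoted above
OVERHEAD_PER_ENTRY_BYTES = 350   # measured serialised overhead, as quoted above


def payload_per_entry(budget: int, entries: int, overhead: int):
    """Return (bytes left for all entries, whole bytes left per entry)."""
    remaining = budget - entries * overhead
    return remaining, remaining // entries


remaining, per_entry = payload_per_entry(
    CHUNK_BUDGET_BYTES, MAX_ENTRIES, OVERHEAD_PER_ENTRY_BYTES
)
print(remaining, per_entry)  # 141600 138
```

So the figures hold under those assumptions. Reading "500kb" as 500 KiB (512,000 bytes) instead would give 150 bytes per entry, and none of this accounts for the compute cost mentioned in the footnote.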

(Incidentally the register address that was put on the network is d3ff73ee84ba7bf34c5f48d78396d827407c7d9728aae4ed59c88d8c556693ada3f1237634659d3050e06f4c072bbebb731001a9ae077c157f907337b20814a3ac987b85716b39d3869dd583e039e5f4 which is 160 bytes as a hex string… so, actually storing pointers via NetworkAddress would also require us trimming down register sizes!)
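For reference, a quick Python check of that address length. Whether an entry would hold the address as hex text or as raw bytes is not stated in the thread, so this simply measures both; the variable names are illustrative.

```python
ADDRESS_HEX = (
    "d3ff73ee84ba7bf34c5f48d78396d827407c7d9728aae4ed59c88d8c556693ad"
    "a3f1237634659d3050e06f4c072bbebb731001a9ae077c157f907337b20814a3"
    "ac987b85716b39d3869dd583e039e5f4"
)

hex_length = len(ADDRESS_HEX)                 # size if stored as hex text
raw_length = len(bytes.fromhex(ADDRESS_HEX))  # size if stored as raw bytes

print(hex_length, raw_length)
```

Hex text uses two characters per byte, so the decoded form is half the size of the printed one. Either way it is larger than a 32-byte entry.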

*They enforce checks on permissions etc, which is some compute resource… so chunk size - register overhead - compute guesstimate per entry gives you your max register size…? And then divide that by entries to get entry size? Something else to factor in there?

4 Likes

That doesn’t make sense to me either but I’m running out of energy for this. I’m only here now because running the examples with “cargo run” rebuilds every target each time and I’ve not set up aliases to avoid that yet. That’ll be next, because running the examples as described in the SN README is incredibly slow.

I’ve not seen anyone say they’re worried about the GET size of a register. If an app needs an index of size X entries it won’t make a difference whether that fits in one or ten registers.

the natural choice would be smaller than a chunk (otherwise chunks will get shoved into registers

I assume you mean “Entries” not Registers need to be smaller than a chunk? Either way, this seems a convoluted way of making sure Entries aren’t large enough for much more than a pointer, however you put it. What’s wrong with data in a register entry, other than type purity (that doesn’t simplify things when you come to build on it, but adds complexity and other problems).

FWIW I wasn’t saying that you have decided but overall here and in a particular comment on the folders PR (“when we …” rather than “if we …” move to pointers only), I’m not feeling like my points are carrying any weight.

So I’m going to step back from this because it doesn’t feel worth my time on this topic.

1 Like

I am really missing this point somewhere, @happybeing. If we add data to entries they become data storage. The logical conclusion is that all data could just go in there and forget chunks etc.

So the point I am making is registers should not hold data, they should point to it and do so in a way that is CRDT and efficient.

I think what you are saying is they should hold some data, but not restricting what that “some data” or metadata is?

Is this right, or am I missing something more here? I don’t want you to get annoyed, but I am not catching your logic.

7 Likes

Yes, I think you understand and I understand a bit more: you say no data, except we are finding the need for some metadata there regardless, for register version, permissions etc. So really you say no app data, and I say that makes things harder, less efficient for apps, and may have wider unintended consequences such as reduced take-up of important features such as versioned data.

For the most part you’ve said your reason was simplicity without explaining that beyond type purity. Now there’s a bit more explanation from you and Josh that this is also about network operation.

I enquired if there were issues around the network etc but this wasn’t explained, so that’s frustrating when I took a lot of time to explain and elaborate the reasons I thought it was good to have app metadata in Register Entries.

We’ve always had some app data in Register (and its ancestors’) Entries, at least the ability to include it, so I think a better explanation of all the thinking would have helped a lot here. Because I feel I’ve wasted my time for no reason and would like to make progress with learning how to access registers so I can try building and helping out with this.

All my comments have been from the viewpoint of an app developer. I understand there are other issues and asked what they might be. The resistance is only now getting some meat on it but, as explained, I now want to leave this to get back to learning and code, which is frustrating enough at this stage of my trying to grok Registers, MerkleReg and so on, and with a local testnet running incredibly slowly or not at all at the moment.

6 Likes

In the early stages small victories count as much as the big ones…

2 Likes

iiuc, a point @happybeing has made, that has not really been responded to, is that a filesystem has complexity and requires some metadata in order to be efficient “enough” for intended usage.

It seems to me that if MaidSafe is in control of both the registers API and filesystem implementation, then one of two things will happen after registers are limited to 32 bytes.

  1. The filesystem will be horribly inefficient, unusable, and eventually a conclusion reached that more metadata is needed in the registers. Back to the drawing board for the registers API.
  2. The filesystem will be efficient enough to be usable, and the 32 byte limit stays.

However, if MaidSafe does not work on a filesystem and leaves that to others, eg @happybeing, then MaidSafe devs are never really eating their own dogfood to feel the stomach aches it may cause.

My guess is that (1) is the most likely scenario, but I’d love to be wrong about that.

btw, this whole discussion is very similar to the op_return debate in bitcoin-core, circa 2013.

8 Likes

This is the point to consider optimisation mind you. Probably not beforehand. Then we can measure complexity and speed. At least that is how I see it.

1 Like

Why? What’s so great about having data there that it would become the preferred way?

1 Like

I think the point is that storing data in Entries is effectively free with the current model. Once you create and pay for a Register you don’t pay for updates.

I don’t know why that hasn’t been stated but it seems a good reason to be shy of storing arbitrary data in Entries.

The design may not be sustainable anyway, but restricting it to pointers may be in part to defend this no pay for updates model.

9 Likes

The complexity is just how many entries are returned. It will always be up to the App devs to decide how to use a register entry/entries so no more complexity there for Safe or App devs.

So it’s a simple calculation/process:

  • API has #keys to return in its parameters
  • API builds in a trivial loop a value of concatenated keys or array of keys
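A minimal sketch of that loop in Python, assuming entries are plain byte strings. The function name `read_value` and its parameters are hypothetical and not part of any existing Safe API.

```python
def read_value(entries, num_keys, as_array=False):
    """Return the first `num_keys` entries, concatenated or as a list.

    `entries` is assumed to be a sequence of byte strings (for example
    32-byte keys) in register order.
    """
    if num_keys < 0 or num_keys > len(entries):
        raise ValueError("num_keys is out of range for this register")
    selected = entries[:num_keys]
    if as_array:
        return list(selected)
    return b"".join(selected)


# Three 32-byte entries used as one 96-byte value.
entries = [bytes([i]) * 32 for i in range(3)]
value = read_value(entries, 3)
keys = read_value(entries, 2, as_array=True)
print(len(value), len(keys))
```

The same routine works equally well on the app side, which is the fallback described next.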

But of course if Safe devs do not want that complexity/debugging then the APP could do it itself so the happy end is using the divine end and building its own routine to get the value size it wants.

@joshuef the real question is if there is another append type of data structure in Safe? Thinking of how databases would work here. I know registers go part of the way to providing index structure, but other forms of append data would be excellent, as it would be an amazing feature to provide database records that are append style, which means database records can have automatic history (the chunks would exist anyhow otherwise, just forgotten). To provide that with only chunks, a database record itself would have to be a register pointing to chunks which are each a version of that record.

2 Likes

I agree, it’s bad comms for sure. The notion is that registers provide the updating data that we need, but in a simple way.

2 Likes

Why does self-encryption need 2 additional GETs? I don’t get it.

Why does it HAVE TO be a key-value store? Why not just a simple “array”? You can use it as a key-value store or anything else. Why limit developers’ creativity? Just because of a gut feeling about some possible unidentified “mess”?

Size should be maximum possible without negative impact on storage space, bandwidth, processor time… How much storage/bandwidth/processor would creating/reading/updating a 1024B register require? And how much a 32B register? How much, if each of them has 1000 changes in its history? Do we have that data? Can we count this? It’s very bad to base a design just on gut feelings and unmeasured specs. (edit: some of this is answered below)

I’d call it a freedom of choice, not a complexity.

How is that possible, if we have 1024B entries?

Exactly 4.3125 times more useful than 32 bytes!

So, once you’ve paid, it’s not free, right? If you pay for a register, why not use it to its fullest?

Regarding napkin maths, I’d count it like this:
We can agree that a maximum register size should be about the same size as a chunk, which costs almost the same (1 PUT), so ~500KB? Then, we want to have ~1000 historical states, which leads us to 500B entries, right? Minus overhead, of course.

But I have another idea. Maybe the maximum size of a register should be the only constant? And every entry could be any size the developer wants? They could store 500KB/32B=15625 historical changes as pointers to chunks? Or, if they want to save the history of a file, they could store all the contents if the file is small, and if it grows in size, just store pointers? Or, perhaps no history and just one 500KB entry? It saves us from having to waste 1024B on a 32B pointer, or from being limited to a too small entry size. Minus overhead, of course :slight_smile: .
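A rough Python sketch of that idea, where the only limit is the total register size and entries can be any length. The class `SizeCappedRegister` is hypothetical, the 500,000-byte cap is taken from the figures above, and per-entry overhead is deliberately ignored.

```python
class SizeCappedRegister:
    """Hypothetical register limited only by total payload size."""

    def __init__(self, max_bytes=500_000):
        self.max_bytes = max_bytes
        self.entries = []
        self.used = 0

    def append(self, entry: bytes) -> bool:
        # Reject the entry if it would push the register past its cap.
        if self.used + len(entry) > self.max_bytes:
            return False
        self.entries.append(entry)
        self.used += len(entry)
        return True


# Option 1: many small pointers.
pointers = SizeCappedRegister()
count = 0
while pointers.append(bytes(32)):
    count += 1

# Option 2: small file contents inline, then a pointer once the file grows.
mixed = SizeCappedRegister()
mixed.append(b"small file contents")
mixed.append(bytes(32))

print(count, mixed.used)
```

With a real serialised overhead per entry, the pointer count would be lower than this sketch suggests.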

Seems core devs say they want to make life easier for app devs, but when an app dev gives them his opinion, they try to convince him he’s wrong and they know better what he needs. When I see things like that, I want to fork :innocent:.

1 Like

Self encryption requires at least 3 chunks. So 1MB of data is split into 3 x 333KB chunks to do the self encryption.

If it’s just a chunk being stored without self encryption then it’s one chunk instead of 3 (assuming another form of encryption is done).

4 Likes

If there are no other reasons, I think registers could be priced a bit higher than other data. They are supposed to be quite small anyway, compared to the general data storing possibilities, so I think a premium of 5% - 20% wouldn’t do much else than discourage the misuse.

But I thought there’s something else too, like them being CRDT and some extra work for the nodes related to that? In which case I think that should be priced accordingly.

We don’t know all the possible ways people may find to make use of the properties of the network, and that’s why I think everything should have a fair price. That’s maybe easier said than done, but in principle.

2 Likes

If I read this correctly then you are paying for each key written to the register. @joshuef ?

see below