Update 14 September, 2023

This will not be a tiny network. If it is, the project will be an abysmal failure.

This is way too literal. An analogy only needs to capture the essence. The 10 “records” (chunks) can (and will) be one purchase at one “location” (the SafeNet), where they all have different “manufacturers” (nodes) and different prices that change real-time. Your version is unnecessarily complex. The use of the term “price check” is very accurate, for that is all it is. At a store it is unlikely the price will change in the 5 minutes you might need to make the purchase at that price.
The key difference that was pointed out is that not all chunk purchases will fail if some node prices increase during the process. So some chunks will be stored and paid for, while others could be rejected and have to be tried again. If you don’t have enough coin to cover the increase, you are stuck (temporarily at least) with only a partially saved file.
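
The partial-save case can be sketched roughly as follows. This is only an illustration under assumptions: `upload_file`, the two-pass retry and the per-chunk price dictionaries are hypothetical and are not taken from the actual client code.

```python
def upload_file(quoted, live, balance):
    """Try to store every chunk; return (stored, stuck, remaining balance).

    quoted: chunk -> price seen at the price check (hypothetical model)
    live:   chunk -> price the node asks when the payment arrives
    """
    stored, retry, stuck = [], [], []
    # First pass: pay the quoted price. A chunk is rejected if the node's
    # price has risen above the quote in the meantime.
    for chunk, quoted_price in quoted.items():
        if live[chunk] <= quoted_price and balance >= quoted_price:
            balance -= quoted_price
            stored.append(chunk)
        else:
            retry.append(chunk)
    # Second pass: retry rejected chunks at the new price, if affordable.
    for chunk in retry:
        if balance >= live[chunk]:
            balance -= live[chunk]
            stored.append(chunk)
        else:
            stuck.append(chunk)  # the file stays partially saved for now
    return stored, stuck, balance
```

With a balance of 30 and one chunk whose price rose from 10 to 25, two chunks are stored and one is left stuck until more coin is available.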

Let’s look at the price check process. Nodes would need to be contacted to get the prices since they can change often. What is wrong with requiring a signed quote from the nodes for a price and a duration (like 5 minutes or possibly less) during which the quote will be honored (with a network standard minimum duration)? If the non-interruptible process (payment and file upload transaction) is initiated before expiration, the quote must be honored. This excludes network delays during the process, since the start time could be signed by the elder handling the request. The payment must be validated anyway, right?
Would this require significantly more consensus mechanism and complexity than is already needed?
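
A minimal sketch of what such a quote could look like, under loud assumptions: `sign_quote`, `quote_is_honourable` and `NODE_SECRET` are hypothetical names, and HMAC with a shared secret stands in for a real node signature only so the example runs on the standard library. The acceptance time is assumed to come from whoever handles the request.

```python
import hashlib
import hmac
import json

# Assumption: stand-in for the node's signing key. A real node would use an
# asymmetric signature, not a shared-secret HMAC.
NODE_SECRET = b"example-node-secret"
FIELDS = ("price_nanos", "issued_at", "valid_for_s")


def _digest(body):
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(NODE_SECRET, payload, hashlib.sha256).hexdigest()


def sign_quote(price_nanos, issued_at, valid_for_s):
    """Node side: issue a quote carrying price, start time and duration."""
    body = {"price_nanos": price_nanos, "issued_at": issued_at,
            "valid_for_s": valid_for_s}
    return {**body, "sig": _digest(body)}


def quote_is_honourable(quote, accepted_at):
    """True if the quote is untampered and the upload began before expiry."""
    body = {key: quote[key] for key in FIELDS}
    if not hmac.compare_digest(_digest(body), quote["sig"]):
        return False
    expiry = quote["issued_at"] + quote["valid_for_s"]
    return quote["issued_at"] <= accepted_at <= expiry
```

Because the quote carries its own start time and duration, the check needs only one trusted timestamp: the moment the non-interruptible process was accepted.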

1 Like

Clarification: This excludes network delays during the save process (not the quote process) … The exclusion starts when the non-interruptible process is accepted.

1 Like

Don’t get me wrong. I am always pushed for time, hence the brevity.

Signing costs CPU, but who do you complain to if the node then does not agree to store, or does not store?

Again, whose 5 minutes? Yours, theirs, or whose? How do you agree on what the time is, and can you prove you tried to do X within some duration? (Sign the time with an external oracle?)

I hope you can see that in decentralised networks these seemingly small, programmatically correct things are not just very hard; they can lead to a whole new design of the network and introduce consensus, which then goes back to all messages and to genesis. This in turn leads to total order, and the whole thing falls apart or runs at 7 or so transactions per second globally.

A better way to think is this,

  • Assume most nodes are honest
  • Follow the colony principles of individuals acting in the interest of the colony before themselves

This horrifies some Engineers and Mathematicians, but it’s very powerful. You then remove the ability for the network to create or rewrite history etc. and you are onto something very fast and extremely secure.

It means, though, that you need to keep thinking about how a correct node can cope with detecting malice, how correct nodes can help the colony (they ignore those bad nodes) and how they can ensure the data is stored and provided on request.

Then add in that unnatural thing (money) and keep that simple: if it’s a simple publish of owned data that transfers a value and can do that only once (one-time key), then again the network cannot steal or create money.
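
As a rough illustration of the “only once” property, assuming a hypothetical `SpendBook` that records which one-time keys have already published a transfer (this is a toy model, not the network’s data structure):

```python
class SpendBook:
    """Hypothetical record of published spends, keyed by one-time key."""

    def __init__(self):
        self._spent = {}

    def publish_spend(self, one_time_key, recipient, amount):
        """Record a transfer; a key that has already spent is refused."""
        if one_time_key in self._spent:
            return False  # second publish with the same key is ignored
        self._spent[one_time_key] = (recipient, amount)
        return True
```

The first publish for a key is the only one that counts, so there is no later step at which the value could be rewritten or duplicated.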

Put that all together and you can see the simplicity of the SAFE network. It may look chaotic, and it may feel insecure or out of sync with provably correct actions (not code). Then think about an ant colony: how does that work? Why have they survived 120 million years, and how does that function (none of them has a watch)? The answer is simple rules, no (global) consensus and each ant doing the right thing (i.e. running the right code).

This is why any agreement on time, proving bad behaviour, getting agreement with other nodes and so on introduces a wholly different approach. The other approach I know of (and dislike a lot) is consensus-driven networks, which by default require total order. These are very hard to get to do just money transfers, never mind anything more complex and valuable, or to operate at billions or more transactions per second.

I hope that helps and I hope you don’t take my previous quick replies as disparaging.

17 Likes

Thank you David, very helpful. I was not aware of this major shift in the processes. I know a lot of code has been removed during development. I do not recall reading this before.

If all saves are handled equally (user’s file uploads/saves are not a separate algorithmic process), then everything (now also including coin transactions) could add up to billions of transactions per second. And all of these require a storage fee (not just file uploads)?
This is a tough problem indeed. Everything needs to be kept to a minimum or delays pile up quickly. This would make signatures expensive since so many would be needed, even if they are “pre-signed” (having an older creation timestamp but still valid based on duration, like a self-issued certificate). Time doesn’t need to be agreed on; the signature defines the start time and the duration of validity.
Without a signature (or equivalent), I do not know how any node can be held accountable; how do you know a message was sent by the true node (authentication)?
If the network just ignores nodes with “bad behavior”, what stops malicious nodes from making honest nodes look bad by impersonation?

4 Likes

They cannot really impersonate. Good nodes are already connected to other good nodes and in communication with them. At a later stage, we will introduce heuristics for nodes to see if neighbours are behaving close to themselves. That’s not important here, but basically there is no real notion of impersonation.

Sybil attacks etc. are handled (or will be) in the network layer, in quite a clever fashion, but for us it’s less an issue anyway.

If you really try to think of how ants work then it is much clearer, albeit more chaotic to us humans :slight_smile:

6 Likes

Same here, but I mean that if you go into a store, get a price, then come back later, the price on the shelf may have changed. The petrol stations here turn off their signs as they change price. But you can still be 100 meters away and decide to fill up, drive in just as they go through the process, and when you get to the pump it displays an updated price.

The point was that the price can change within a minute between getting a price check and coming back to buy. In the case of the petrol station, you see the price while some way up the road, and drive in only to find it has changed by the time you get to the bowser.

4 Likes

I was just describing both situations. If the network is large enough, it is rare for 2 different records being uploaded to end up on the same set of close group nodes. But statistically it will happen some time with some 2 records in one file upload.

Your analogy was just wrong, and the literalness was to explain why, just in case you missed something. It is more like going to a ton of stores and then coming back to them later. You were saying you were in the store the whole time and the price changes between the shelf and the checkout register. Such a bad analogy. You at least have to have some time away from the one store.

Also, this is one reason I usually don’t like analogies (note: usually): you reduce the facts to assumptions and often oversimplify. In this case your and my analogies complicated the simple process.

Get a price, go away and decide whether to accept it, and the price can change between asking and finally going to get it. This happens at stores, and in business when just price checking (no quote) and coming back later to buy/order.

Long-term quotes were pursued, and the complications made them not so good. One was that if the quote lasts too long, the node might have filled up a lot in the meantime, and accepting old quotes (even 5 minutes could be too long) could mean the node overfills trying to honour them. Accounting for that needs more code, which in turn means rejecting the quote.

Since you are thinking of a larger network then, as I have said, the price will not change very often, since the storage is spread across the network and grows slower than what we’ve seen in the tiny networks. And it will be very rare for nodes to change price twice in a short time. Thus it is in the client’s interest to only get a price check just before uploading, so the user can be confident the variance will be small.

And a simple part of the client code can be what overall variance it will accept before either aborting or asking the person whether to proceed or not. But to include that in the node code creates the problems that David pointed out in his post.
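
A minimal sketch of such a client-side check, assuming a hypothetical `decide_upload` helper and a default tolerance of 10% (both illustrative, not actual client settings):

```python
def decide_upload(quoted, current, max_increase=0.10, interactive=True):
    """Compare total price now against the earlier price check.

    quoted / current: chunk -> price in nanos (hypothetical model).
    Returns "proceed", "ask_user" or "abort".
    """
    quoted_total = sum(quoted.values())
    current_total = sum(current[chunk] for chunk in quoted)
    # Within tolerance: carry on without bothering the user.
    if current_total <= quoted_total * (1 + max_increase):
        return "proceed"
    # Outside tolerance: ask the person, or abort when nobody can be asked.
    return "ask_user" if interactive else "abort"
```

Keeping this decision in the client means the nodes never have to agree on time or commit to a price.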

3 Likes

Even ignoring the cost of signing… This just kicks the can from now to X time away. (Why 5 minutes and not 10 minutes? Why not one? How would anyone decide on this? Should node operators?) What if the closest nodes to the data have changed in the meantime? Should a new node be expected to honor an old node’s price?

But why should a node accept a price that they actually don’t want?

I think this is the big difference in thinking perhaps. As things stand nodes don’t need to be held accountable for their price. (Why should they be?)

They either store data or do not. And as soon as we have some fault detection, we’ll be able to weed out nodes not storing data that they should be. That is what we really care about. Not that the price changed over time, that’s actually something we want.

6 Likes

Does it mean, that redundancy will not be constant (5, 8, whatever) throughout the network?

Nevertheless, another problem comes to my mind. It seems to me that if we allow for competition between nodes by price, it will conflict with a gentle rise in the price of fuller nodes. Look at this image:
image
If node [1] has 20 GB left, let’s assume it will charge 10 nanos per GB (or PUT?). But another node [2], will try to charge less to get more uploads, like ~6 nanos with 20 GB left:
image
So, the best strategy would be to keep a minimum price all the way up to the limit of storage space, and then just raise it to effectively reject uploads without being taken as malicious.

There is also a question of target price (that is when node is 100% free, the lower limit). Will it be set by a node operator? This must somehow reflect the cost of storage and node operation. And big, massive node operators will have that cost brought down to minimum. When we add the previous effect (constant price regardless of free space), this leads to centralization of storage in hands of biggest players.

The only entities, that could compete with these, will be zero-target-fee personal enthusiasts donating their free space for the sake of network resilience. Although, I assume there will be just a handful of them.

I have a proposal – If we want to constrain large scale, for profit, centralized hosting, we have to prioritize nodes at home, and even better, make every client a node with some default shared space. All at zero target price.

(perhaps this discussion could be moved to new topic, @moderators ?)

4 Likes

It’s an interesting point you make. Why would a node set its price in relation to space left? I don’t think it would; rather, it would act in the way you just described, setting the price in relation to competing nodes. Though I think it would try to ask more than just the minimum.

But how are nodes going to set their price? I think a node should set some kind of rejection rate, so that for example 10% of the offers it gets are going to get rejected. It would have a price X for some time, then raise it to price Y, and if 90% of “asking to saves” agree, then it’s OK to keep that raised price.
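
The rule could be sketched like this. `settle_price` and its parameters are hypothetical, and the sketch assumes every counted request is a genuine intent to store:

```python
def settle_price(old_price, trial_price, requests, accepted, min_accept=0.90):
    """Keep a trial price only if enough requests agreed to it.

    requests: how many "asking to save" requests saw the trial price
    accepted: how many of them went ahead and stored at that price
    """
    if requests == 0:
        return old_price  # no evidence either way, stay where we were
    if accepted / requests >= min_accept:
        return trial_price  # at most ~10% rejected: the raise sticks
    return old_price  # too many walked away: revert
```

For example, a raise from 10 to 12 sticks when 93 of 100 requests accept it, and is reverted when only 80 do.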

But if anyone can ask for a price check without any commitment, then the node has no way to tell apart someone who is “just asking the price” from someone who is going to PUT if the price matches. Then the node cannot know whether it missed a serious offer because of too high a price, and the method above cannot be applied.

I think either the client or the node should commit to something. Or then there should be some other way to know if a node could charge more or not.

4 Likes

This is the supply/demand balance to achieve the cheapest global price for data storage. As the network gets more expensive to Put, more operators run nodes. As the network grows and has more resources, it pays out less and data is again cheaper to Put.

This balance should provide data storage at a true cost and one that is truly the cheapest unsubsidised cost.

I think of it as this

The price of data storage on SAFE is defined by the amount of tokens the network pays out to operators to keep their computer running. It should be enough to have as many computers running as it needs, but no more. So never too little resource and never too much.

A true open market with no middlemen and one that is the most environmentally favourable.

8 Likes

True, but why would I set my price any differently if my node has 90% empty space or 5% empty space?

Yes, but how do I as a node operator define if the price my node asks is too little or too much to ensure this income?

Too little → I get chunks, but I just don’t earn enough.
Too much → I get good price per chunk, but too few chunks. The problem is that without any commitment, it is hard to find the price people are willing to pay.

2 Likes

Mainly because you are well behaved and following the rules.

This is decided by the code, so you don’t really get to choose. If we made it so you choose, then perhaps it all falls apart. And if folk want to change their code, they are not well-behaved, and I would hope the other nodes detect that and ignore it (effectively killing that node).

I think they want to pay the best price possible and the network should find that for them

4 Likes

If you have 5% empty space and don’t increase your price, you’ll soon run out of space and then get zero payment from your node, until you add extra space (assuming that’s possible).

So self interest means you’ll increase your price if you’re filling up too fast, and hopefully the price will go up to a level where it’s worthwhile you adding more storage to keep the payments coming.
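
To make the shape of that argument concrete, here is a purely illustrative curve. The formula and the name `store_price` are assumptions made for this sketch and are not the network’s pricing function:

```python
def store_price(base_nanos, used_fraction):
    """Hypothetical price curve: the less free space, the higher the price.

    base_nanos:    price asked by a completely empty node (assumed)
    used_fraction: share of the node's space already used, 0.0 up to < 1.0
    """
    if not 0.0 <= used_fraction < 1.0:
        raise ValueError("used_fraction must be in [0.0, 1.0)")
    # Price is inversely proportional to the remaining free share.
    return base_nanos / (1.0 - used_fraction)
```

Under this assumed curve a half-full node asks twice the base price, and a node with 5% free space asks twenty times as much, which slows further filling.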

3 Likes

Node gets paid by PUTs only, right? So why not let my node get full, collect the money, delete the node - and start again?

2 Likes

I don’t know the plans now, but previously I believe the plan was that you would need to behave well for a time before you reached a sufficient node age to then be eligible to earn, and if you scrap the node and start again, you’ll have to start from node age zero.

You’d be better to just add more storage and maintain your age vs resetting the score and not getting paid for a while, effectively providing services for free until reaching the required age again (which would kind of compensate the network for the churn you created).

3 Likes

Very keen to learn if this is still the (intended) case.

2 Likes

It is nothing like that. The whole process occurs in seconds. Try that with a “ton of stores”. You don’t “visit” or even “call” the nodes (stores) to get their prices in this system, it is done for you.
Whatever.

1 Like

Not sure about this, so fix my thinking if need be.

The data a node is responsible for changes on churn, so doesn’t the number of records it stores fluctuate?
So getting full is only possible if the network stops growing.

So letting “your” node get full and killing it is not really how it works.

Yes, No… ?

4 Likes

5 minutes was an example. This is childish.

I don’t know, I did not state anything about that. A signed quote is something they agreed to.

You did not read my posts. The topic was why a contracted price could not be implemented so prices could not change during an upload attempt and the client would know the price at the time of the transaction. Dishonoring a contract (i.e. quote) may not be important to you, but it definitely is a sign of bad behavior. The problem here is using signatures is too costly due to the expected high volume.

As David stated with the ant colony analogy, the incidents where accountability is enforced must be minimized to keep throughput high. This is a different paradigm than it was before. I got further behind understanding all the changes to this project than I thought.

Let’s hang anyone with a concern we don’t want to read and try to understand.

1 Like