Why are 256bits "enough"?

I am not a cryptography expert, if I’ve gotten any of this wrong, please set me straight and help me understand this.

After reading through the new Safe Network Primer by @anon40790172 and @JPL I noticed that 256 bit addressing was mentioned throughout. In the some of the original safe network documents I see 512 bit addresses mentioned. Considering recent recommendations by security experts for a SHA-384 (or higher) I’m curious how the conclusion that MaidSafe’s original intentions were overkill and that a 256 bit hash “is enough” was reached.

I understand that 256 bits gives a huge number of collision free hashes (an infinity of infinities), and that each set of network objects have their own address space giving SAFE multiple infinities. This size reduction does make a lot of sense for node ids and client ids and such. I guess my only concern relates to the address space for mutable and immutable data chunks; but I do realize that it is probably an all or nothing type affair in order to keep XOR routing simple.

It would seem that the decision on address space bit depth has long term implications, and might require a rather difficult upgrade in the future. Also, because of the marketing optics it seems like some “expert” could criticize SAFE as “not safe enough” regardless of the veracity of the statement and then easily fork the code to a 512 bit network. From the other perspective I can’t foresee a situation where someone would bother forking from an original SAFE network that uses 512 bits to a “safenetwork lite” using 256 bits, nor can I ever really foresee needing more than 512 bits for centuries ( just like 640k ram is all the memory anyone would ever need :wink: ). I’m sure there are probably some performance implications with regard to IOT and mobile if starting with 512 bits, but it would seem that Moore’s law and cell phone asics will address those concerns pretty easily. It also seems like the amount of churn involved with mobile devices / cell phones will make it unlikely that they will be doing that much farming in the first few years anyhow.

So now that I’m done with the long-winded intro to this topic :

“Why are 256 bits enough?”

7 Likes

There was an earlier discussion here, I’m not sure if it is the only one. I suspect not!

1 Like

Since I’m nothing close to someone knowledged about all this stuff I would like to just add a comment by someone in the telegram group


lukors:

People were talking about hash collisions. I found this with a quick online search:

"for all those doubting the security of 2^256 collision chances, there’s the number: There is a 1 in over 115 quattuorvigintillion (that’s a 78 digit number) chance of finding a collision.

This number is bigger than the number of atoms in the perceivable universe. And not by just a little bit either. Exponentially bigger. This number is so big that the human mind can’t comprehend how big it is. It’s just really big. Huge. I can not overstate this enough. This is a very big number. Your financial and cryptographic transactions are secure because of how big this is. Only a fool would attempt to brute force this many possible combinations."

Source: https://learncryptography.com/cryptanalysis/why-is-2-256-secure


And a thought I had:

If you look at https://allbitcoinprivatekeys.com you will find that on the first page a lot of addresses that have been in use (in spite of the incredibly low possibility to find a used private key in general)

As I understand it this is due to the seeds used to generate those first addresses - is that something that can happen at the safe network too? Okay… It’s not collisions but a skewed probability for certain address… What might cause a collision problem…?


Ha @happybeing you’re the best =D I didn’t know how to find a link here - thanks for digging it out!

1 Like

The forum link you posted is a good one. But they are talking about a SAFE network that uses 512bit hashes there, not 256bit.

One good quote from the link you have:

3 Likes

The second last message of Timo is interesting - there he did what we were looking for

2 Likes

Nice find. You beat me to it. Although I’m not sure if @Tim87’s post is enough to say that the “science is settled” on the matter. It would be interesting to know if his discussion/example was instrumental in the decision to switch to 256 bits though…

2 Likes

Thanks for posting the question =) I didn’t find the answer myself and doing stuff like that on mobile just is annoying xD

This article doesn’t take the birthday paradox into account. According to the table here, we only need to generate 4.0 × 1038 SHA-256 hashes to get a 50% probability of collision.

This represent 400 billions billions billions billions hashes. I think this is safe enough, yet this is largely less than the number of atoms in the universe (1080).

4 Likes

Reminds me of something I wrote recently:

something as crushingly undeniable, as the amount of power needed to bruteforce SHA256.

(Something along the line of 4 billion galaxies each with 4 billion earths, each earth with 4 billion people each running a thousand times Google’s worth of servers, during roughly 37 times the age of universe. Neat huh?)

From a Fermi estimate found in this nice video .

4 Likes

A quote from David Schwartz (Chief Cryptographer at Ripple) on stack overflow
"
For practical purposes and the foreseeable future, both SHA384 and SHA512 are good enough for almost any imaginable collision-resistance application. Generally, the primary determining factor of what hash you would use, once we’re above 256 bits and collision resistance is infinite for practical purposes, is just how many bits of output you need. For example, if you need the hash to generate both a 256-bit HMAC key and a 128-bit encryption key, SHA384 is a natural choice. If you need as much output as possible for the computational cost, say for output of a PRNG or as random padding, then SHA512 makes sense.
"

So doesn’t SHA512 make more “sense” for a SAFE network?

1 Like

Bitcoin is “only” 160 bit. And even if all computers would create new bitcoin addresses all day long there wouldn’t be any collisions for hundreds of years. Every added bit doubles the previous amount. So from 160 bit to 161 bit is already a doubling from 160 bit. At 162 bit it’s a doubling from 161 bit etc. So when you’re at 256 bit the number is just extremely big. Think we can assume we’re safe here :+1:

1 Like

Additional info from a dev forum post 4 months ago:

1 Like

@oetyng Thanks for pointing out the background data for the collision probability derivation and the birthday attack. This doesn’t really answer my questions though…

A few remarks:

  • Considering HPC performance and capacity increases alone at the typical 3 orders of magnitude every 10 years gives you a back-of-the-envelope figure that has @hunterlester’s example being reached in about 50 to 60 years, not 4 quintillion.
  • Taking experts at there word shows quantum computing claims that effectively cut the bit depth in half, similar to the birthday attack, so you need SHA-512 to get 256bit security and the comfort level of @anon40790172, but it seems that in some cases SHA-512 security can be further reduced to 170bits.

From searching the forum ( or easily shown in the primer :grinning:) I see that “SAFE uses SHA-3”. A search of the vault code on github confirms sha3_256 is the chosen hash function.
https://github.com/maidsafe/safe_vault/search?utf8=✓&q=sha3&type=


Here too is a nice table on wikipedia that compares different hashing algorithms, showing the need for sha-3 (The whole wiki is a good introductory read…) :


While big numbers are fine and make a point by allowing us to make relative comparisons, I do not think the “big number” argument is really the right way to look at things; since “big numbers” are only “big” relative to the computer processing power you have easy access to or how many degrees you have or how much politically motivated funding you can grab hold of. The big numbers approach just leads to a thought experiment arms race. And yes, I do understand the arguments and silly insults posted online like from the folks at syncthing:

“A mass-murderer space rock happens about once every 30 million years
on average. This leads to a probability of such an event occurring
in the next second to about 10^-15. That’s 45 orders of magnitude
more probable than the SHA-256 collision. Briefly stated, if you
find SHA-256 collisions scary then your priorities are wrong.”

(We’ll need to address that ^^^ issue facing all of us sooner or later as well… :wink: )

We are ants, the future always brings a bigger boot; and near earth asteroids aren’t multiplying and moving faster/closer to earth at a rate by 3 or 4 orders of magnitude every 10 years. The number of atoms in the universe may be finite, but ideas are not. We have no concept of what we will have no concept of 20 years from now; and one of the big bright sunny beautiful ideas in SAFE is the objective to keep data safe “forever”. :sunglasses: :sunrise:


EDIT: Also found some interesting notes here :


Precaution.
I would instead offer a (long term) “precautionary approach” to risk management first, and only then consider any performance implications for that decision (in the short term) as a secondary issue. It’s supposed to be the SAFE Network, not the SPEED Network right? I don’t think this is “going off the deep end” or as “crazy” or “silly” as some might think. It is a situation of constraints on known future safety, in an environment of unknown yet maximal future processing resources. The simple fact is that 4 accepted and established standards exist SHA3-{224, 256, 384, 512} and are available for devs to choose from now. While you can take the view that SHA3-256 is safe, and granted I agree that it is for a long while, maybe long long enough. But the simple truth is that SHA3-512 will always be safer, and a modified SHAKE-2048 safer still; so in my mind the question comes down to how much safeness can be afforded from a hardware performance perspective at genesis on launch day.

It would appear that SHA3-512 isn’t that big of a deal computationally (this post discusses the faster 512bit competitor blake2), and for some processors it is more efficient, taking less than 2x the time of SHA3-256. It also falls in line as a codified and established standard rather than an arbitrary spin of SHAKE-Xbits. The wikipedia table references put median SHA3-512 at 16 cycles/byte vs. 8 for SHA3-256 on Skylake, while an ARM cortex-9 (1ghz) can be around a 120 cycles/byte for SHA3-512 and about 70 cycles/byte for SHA3-256 . It that benchmark performance also plays out in practice, a single amd64 core in a single mid-range 2Ghz processor (which will soon have 8 to 16 cores on average) can completely saturate a 1 Gb network connection. ( EDIT : Just noticed those were software benchmarks. Small dedicated hardware ASICS are doing SHA3 “Keccak” between 15Gb/s and 44Gb/s … so an RPi with something like that might fly. :smile: )

A few questions are raised from these figures that are pertinent to the conversation :

  • What will it take for 1Gb at home to be the standard?
  • How many RPi are really going to run Gig-E to an ISP on launch day when they can’t even saturate a 100Mbit connection with SHA3-256?
  • When wireless Gigabit arrives to mobile for everyone/everywhere, won’t that mobile processor in their pocket be another order of magnitude faster than now?
  • Why skimp? Today’s epyc threadripper is just an RPi in a few years from now, right?

With SHA3-512, you’re basically telling folks that they will need to buy 2 RPi for every one they had planned if you want to keep a 100Mbit connection saturated, but most people don’t have more than a 25Mbit connection so I would think that if an RPi is actually up to the task, it will keep up just fine with either SHA3-256 or SHA3-512. Embedded x86-64 boards make nice low cost options too, at an often better bang for the buck compared to RPi if low power computational efficiency is the main reasoning for reducing hash sizes to 256bits. Lastly,

  • Wasn’t the original plan for SAFE using a 512bit hash for XOR routing? Why change?

“This basically means that we have multiple types in the same space, so message type1 has a 2^512 network, message type 2 has another 2^512 network etc. This can be thought of as each message / data type has its own network.”


That’s about all the rant I can muster for this topic. Apologies if I kept repeating myself, but I bring it up again because it seems like the hash bits issue is the one thing that can’t be easily solved through future updates and gradual network evolution. Since the dev teams are ramping things up for routing, I figured this would be the last time to be able to bring this topic up. Maybe this is a big misconception on my part, so please mention if it is. Always happy to be convinced otherwise… and look forward to learning more insights from others. Maybe it’s time I summon @neo? :blush: Cheers.

4 Likes