Narrow pipe hash function designs have recently come under fire, particularly in reference to some SHA-3 candidates. Is this criticism valid? Can it be explained more simply than this paper does?
How valid is the concern over narrow pipe cryptographic hash function designs?
174 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail AtThere are 2 best solutions below
On
This is a purely theoretical attack that is quite minor and not of any great value in carrying out an actual attack on the security properties of any given hash function.
The explanation involves a bit of math, that while obvious, if explained pedantically using formal mathematics is quite obscure. But, since such pedantry is what is desired on this site, I will give the obscure and strictly correct formal math answer, and then explain it so someone who isn't actually particularly familiar with formal mathematical notation might hope to make sense of it. The original paper, of course, uses formal mathematical reasoning and never really bothers to explain it, which is why it is inaccessible and why I asked this question in the first place.
Personally, I feel that papers that rely on this kind of thing and don't bother to explain it in a way that makes sense to people who don't want to wade through a maze of specialized symbolic notation should be rejected for publication. Then maybe the people who write them would learn to communicate.
So, here's the semi-useless math-out section:
There is a function such that:
$$f(x) \to y$$ $${ X \equiv \{ a \mid a \in \mathbb Z, 0 \leq a < 2^n \} }, { x \in X }$$ $${ Y \equiv \{ a \mid a \in \mathbb Z, 0 \leq a < 2^m \} }, { y \in Y }$$
In an ideal hash function that functions as a random oracle, $f(x)$ maps each individual member of set $X$ onto a completely random member of set $Y$. We'll define $O$ as the range of $f$ like so:
$$O \equiv \{ {y \in Y} \mid {y = f(x)}, x \in X \}$$ $$O \subseteq Y$$
if $n < m$ it is clearly the case that $O \neq Y$. Also, if $n = m$ it will also be true that, in the most probable case, $O \neq Y$ (even though $O \subset Y$). Though as $n$ grows larger than $m$ it becomes increasingly probable that $O = Y$.
It turns out that, on average, $\frac{n(O)}{n(Y)} = {1 - \frac{1}{e}}$ when $n = m$, as is the case in narrow pipe hash functions. ${1 - \frac{1}{e} } \approx 0.632$, so this means that only a little more than half the members of $Y$ will ever be mapped to from $X$.
$$\frac{2^n}{2} = 2^{n-1}$$ $$\frac{2^n}{\frac{1}{1 - \frac{1}{e}}} = {2^{n-\log_2 {1 - \frac{1}{e}}}} \approx {2^{n-0.66}}$$
This means that $\log_2 n(O)$ is approximately 0.66 less than $\log_2 n(Y)$ if $n = m$. If $n$ is much larger than $m$, as is the case in wide-pipe designs, then it's most likely that $O = Y$ so $n(O) = n(Y)$.
Less formally, a narrow pipe design maps an $n$ bit value to another $n$ bit value. If this mapping is a completely random mapping, that means some values will not appear in the output since there will be some output values for which there are multiple input values that map to it.
It turns out that the number of output values is (in the average case) $2^n \cdot (1 - \frac{1}{e})$. Since ${1 - \frac{1}{e}} \approx 0.632$ that means nearly half of the possible output values won't actually happen. This sounds kind of scary until you think about it. If you lose a full half of the output range, that means you've only effectively lost one bit, and since you're losing less than half of the output range you've effectively lost less than one bit. This is very minor.
So, it's a valid attack, yes. But this does not mean it is at all something to worry about.
You're asking whether an ongoing research-level discussion has merits or not. This is difficult to know since experts are actively working on this stuff and you never know what can happen: criticism can fizzle out (as did Courtois's algebraic attacks on AES, after causing a major scare 10 years ago) or pay off (as did Xiaoyun Wang's MD5 attacks).
You want my opinion? I'll give it: existential proofs for hash function security (which is what the cited paper in your question offers) do not hold much sway. Unless you can exhibit an actual attack, I don't think the alarm bells should be ringing. Here's an appropriate example:
Consider some SHA-3 hash function $H$. (For the uninitiated, a cryptographic hash function is a map from a string of any length to one of some specified fixed length; it's a public function; there is no key.) Since $H$ has an infinite domain and a finite range, so by the pigeon-hole principle, there exists a collsion. Therefore $H$ is not collision-resistant and thus it's insecure.
Bogus right? Existential proofs don't mean much in a computational setting. Of course there are collisions, but the hope is that we cannot find them.
That said, there are other papers giving unconvincing attacks against narrow-pipe constructions, so even in the absence of concrete practical attacks, sentiment will grow against the approach even if there will never be any reason to reject it. As sad as it is, cryptography is still partly religious when it comes to designing primitives (meaning we rely a lot on intuition and instinct rather than on proofs, because proofs of security don't exist for hash functions and blockciphers and integer factorization, etc).
Edited to Add: I prefer to remain anonymous here. My PhD is in cryptography, I have 5-6 papers related to cryptographic hashing, and I'm a professor at a "pretty good" place in the U.S. You can read the above opinion with this in mind, or you can discard it as "suspect" and "unsubstantiated." I'll leave that to you.