What is the probability that the hash of a hash is the same hash?

277 Views Asked by At

Given an $n$-bit uniform cryptographic hash function (128-bit md5, 256-bit sha-256, etc), what is the probability that the input is the same as the output?

Let the hash of $x$ be denoted by $f(x)$. What is the probability that $f(x) = x$, where $f$ is an $n$-bit hash function?

For simplicity, assume that we are using a 128-bit md5 hashing function. The input is limited to the domain of 00000000000000000000000000000000 to FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF (since the domain must be the same as the range otherwise the input cannot possibly be the same as the output). There are $2^{128}$ possible inputs. There are also $2^{128}$ possible outputs. Does this mean that for any given uniform hash function, there is nearly a 100% probability that $f(x) = x$ for a single $x$ in the domain?

1

There are 1 best solutions below

14
On BEST ANSWER

Let $N$ be the number of inputs/outputs (e.g., $N=2^{128}$). Assuming that hashes are uniformly distributed, $\mathbb{P}(f(x)=x)=1/N$ for any given $x$.

The probability of at least one fixed point (i.e., that one or more input maps to itself) is, under the same assumption as above, \begin{align*} \mathbb{P}(\exists x\colon x=f(x)) & =1-\mathbb{P}(\forall x\colon x\neq f(x))\\ & =\textstyle{1-\prod_{x}\mathbb{P}(x\neq f(x))}\\ & =\textstyle{1-\prod_{x}\left(1-\mathbb{P}(x=f(x))\right)}\\ & =\textstyle{1-(1-1/N)^{N}}. \end{align*} This is approximately $1 - 1 / e \approx 0.632$ for $N$ sufficiently large (to see this, take the limit as $N \rightarrow \infty$).