Understanding hash functions.

57 Views Asked by At

What I understand by a Hash function, is a function $H$, such that taking an input $x$ of some bit-length $L$, produces an output $y$ of some bit-length $l$ such that $L >> l$ (where ">>" means that "much more bigger"). So that we have $H(x)=y$.

The doubt is that sometimes I saw that $H$ instead of taking one input $x$, it takes various inputs, say $x_1, x_2, x_3, \dots, x_n$ (without even specifying how many inputs or even changing the number of inputs they are giving to $H$ depending on the situation).

What I think is that they just do $x_1\|x_2\|x_3\| \dots\|x_i = x$ (so they pad all of the $x_i$'s and call $x$ to the result), where $i \in \{2, \dots, n\}$. Is that correct? In that case, just by reordering the $x_i$'s would change the output of the hash function $H$; so the order of the inputs matter.

Also, what it means a base hash (normally denoted by $Q$) and extended base hash (normally denoted by $\overline{Q}$?

All of this is based on the well-known SHA-256.

1

There are 1 best solutions below

2
On

Normally, we donate a hash by $H:\{0,1\}^* \to \{0,1\}^h$ where * is the Kleene star means 0 or more $h$ is the output size like 256 in SHA256

Merkle Damgård (MD) construction: SHA series, expect the SHA3 which is based on Sponge Construction, is based on the Merkle Damgård (MD) construction.

Compression function: In MD, there is a compression function $C$ which is usually a simple block cipher with lots of iteration, that takes the input of the previous value and the current message block as the key, and output is the encryption. $$C:\{0,1\}^{256}\times \{0,1\}^{512}\to \{0,1\}^{256}$$

Padding: Let the message size is $\ell$. For SHA256, the message padded first add 1, then smallest $k$ zero bits, then the 64-bit encoded $\ell$ is added so that $$\ell +1+ k \equiv 448 \bmod 512.$$

Message processing: for each is divided 512-bit parts, each message block is contributed into the hashing as;

$$C_i = C(H_{i-1}, m_{i})$$ and

$$C_0 = (constants, m_{0}).$$ The constants can be found on NIST.FIPS.180-4.

The Hash: $$H(m) = C_{t-1}$$ where $t$ is the number of block of the padded mesage.

In that case, just by reordering the xi's would change the output of the hash function H; so the order of the inputs matter.

Yes, reordering will change the result. However, SHA-1 and SHA-2 are vulnerable to length extension attacks like any MD based hash function.