https://www.cs.cornell.edu/courses/cs312/2008sp/lectures/lec21.html says
For a hash table to work well, we want the hash function to have two properties:
Injection: for two keys k1 ≠ k2, the hash function should give different results h(k1) ≠ h(k2), with probability m-1/m.
Diffusion (stronger than injection): if k1 ≠ k2, knowing h(k1) gives no information about h(k2). For example, if k2 is exactly the same as k1, except for one bit, then every bit in h(k2) should change with 1/2 probability compared to h(k1). Knowing the bits of h(k1) does not give any information about the bits of h(k2).
What is the purpose of the "diffusion" property?
It doesn't reduce collision, correct?
Is "diffusion" needed to increase performance in open addressing hash table? But not in chaining hash table?
Is it needed in cryptographic hash function?
Thanks.
Yes, it is needed for the cryptographic hash functions. It is more knowns as Strict avalanche criterion
Consider that a cryptographic hash function candidate fails this. So, $k_1$ will give information about $k_2$. How one can use this?
Consider it is a 4-bit hash function where the first output bit doesn't change %50 with the first input bit. Let's say the relation is true with %80. Then this will
reduce the timing of the preimage attacks. For random input, we can immediately say that the input can validate or not with at least %80 probability. Or we simply generate only those on searching/attacking. If one can find more biased bits, it causes less time to attack.
And, similarly reducing the timing of the collision attack - first output is 1 then the first input is 1 with %80 probability. Concentrate on the rest...
This is a small demonstration. If there is a bias in the first component, probably more in the others even with some combinations. too. So, hash function designers either test their outputs or use well-established components to achieve this. Merkle-Damgard hash function can use block ciphers that have no related keys ( AES has it, so it is not a hermetic cipher and we don't see AES-based hash function). One can also, rely on the fact that even tiny round functions like TEA block cipher, can be secure with a high number of rounds like the SHA-1,2 series.
In cryptography, a tiny bias can be exploitable, and we don't want it. A very good example is in ECDSA two-bit bias doom!