Zobrist hashing collision probability 64-bit vs twice 32-bit

539 Views Asked by At

I've been thinking of the right site to post this, finally decided against SO and chess.stackexchange.com, since it's really more a mathematical question.

I'm looking into how position databases in games, specifically chess, work. I found that a frequently used technique is Zobrist Hashing, which seems like a reasonable idea. Most seem to use a size of 64 bits for the hash, which also seems reasonable, as we can expect one collision at around 2^32 positions in the database.

For those not familiar (and I'm leaving out some detail which I don't believe is relevant here), Zobrist hashing assigns a 64-bit hash (randomly chosen) for every piece on every square of the board (there are 6 different pieces in two colors, so 16, on 64 squares, plus a few special hashes to describe the board situation such as whose turn it is, who may still castle, etc., in total around 1000 hashes), and all the hashes that apply to a specific position are bitwise XORed to produce the hash of the position.

Now with such a database, you can search the database for a specific position by computing its hash and looking that hash up in the database.

I now had the idea whether it would be nice to be able to search for positions just from the perspective of one player, e.g. only consider white pieces, not black pieces. My idea is to simply save two hashes of 32 bits each, one only considering the white pieces and one the black pieces.

Those two hashes together are evidently 64 bits again, what I did is to essentially restrict the hashes in the hash table to using only one half of the bits for the hashes concerning white pieces, and the other half for the black pieces, so I'm restricting their domain a bit.

My question now is: Does splitting a hash into two like this increase the likelihood of a collision? If so, by how much?