I am using image hashes to compute a Hamming distance. I now need to pass this value through a function that yields a similarity score between 0 and 1, where smaller Hamming distances map closer to 1 and larger Hamming distances map closer to 0.

My initial approach was 1 - (Hamming_score/length_of_hash). This works only when the Hamming distance is no greater than the length of the hash, which is 16. For Hamming distances greater than 16, I get values less than 0, which is undesirable. I could avoid the negative values by reversing the formula to 1 - (length_of_hash/Hamming_score), but that formula approaches 1 as the Hamming distance grows, so it violates the requirement that larger distances should give values closer to 0.

Here is my current function, which isn't helping:
    def calculate_similarity(hash1, hash2):
        hamming_distance = sum(bit1 != bit2 for bit1, bit2 in zip(hash1, hash2))
        hash_length = len(hash1)
        similarity = 1 - (hamming_distance / hash_length)
        # Ensure the similarity score is within the valid range [0, 1]
        similarity = max(0, min(1, similarity))
        return similarity
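One observation: because zip pairs elements one-to-one, the count in the function above can never exceed len(hash1), so a Hamming distance above 16 suggests the distance is being computed at the bit level somewhere else. The following is a minimal sketch of normalizing by the number of bits instead, under the assumption (not confirmed here) that each hash is a 16-character hexadecimal string encoding 64 bits. The names hamming_bits and calculate_similarity_bits are hypothetical and exist only for illustration.

```python
def hamming_bits(hex1: str, hex2: str) -> int:
    """Count differing bits between two equal-length hex strings."""
    if len(hex1) != len(hex2):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree; count those bits.
    return bin(int(hex1, 16) ^ int(hex2, 16)).count("1")


def calculate_similarity_bits(hex1: str, hex2: str) -> float:
    """Map the bit-level Hamming distance to a similarity in [0, 1]."""
    # Assumption: each hex character encodes 4 bits, so 16 characters -> 64 bits.
    total_bits = 4 * len(hex1)
    return 1 - hamming_bits(hex1, hex2) / total_bits


if __name__ == "__main__":
    # Last hex digit differs in all 4 of its bits: 4 of 64 bits disagree.
    print(calculate_similarity_bits("000000000000000f", "0000000000000000"))
```

With this normalization the distance can never exceed the divisor, so the result stays in [0, 1] without clamping. If the hashes are stored in some other form (for example, arrays of booleans), the divisor would instead be the number of elements compared.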
Any help is appreciated.