Finding similarity of strings using distance function : Bounding the distance function?

134 Views Asked by At

I want to know if 2 binary strings $s$ and $t$ each of $d$ length (dimension) and N = 2 (the alphebet) in this case 0 and 1 are similar to each other or not using the following distance function where $k$ denotes the iteration number or the time index

enter image description here

Using programming language, I implemented the formula but I am unable to decide if there should be a threshold so that distances less than the threshold are considered similar to each other. The QUestion stems from the tutorial https://www.math.ubc.ca/~andrewr/620341/pdfs/symb_sum.pdf which mentions another distance function. For both of the distance metric, what is the usual practice of deciding the bound $\epsilon$ in the condition $d(s,t) < \epsilon$

I am having a tough time in understanding the document and not sure if the Authors mean that 2 strings are similar if their distance is less than $1/2^n$ where $n$ is the number of iterations / length of the vector. Shall be greateful for an easier explanation.

How to find the minimum and maximum distance bound? Thank you

1

There are 1 best solutions below

2
On

It looks like that document considers strings of infinite length. So the theorem says that if two strings agree on the first $n$ characters, their distance is less than $1/2^n$. If you're dealing with finite strings of length $n$, the smallest distance possible between two distinct strings is $1/2^n$, so in this case if the distance between two strings is less than $1/2^n$, the two strings are the same.