Help deciphering Levenshtein formula

263 Views Asked by At

I am trying to completely understand the Levenschtein formula, and I have been reading the Wikipedia article on this. However, the description of the mathematical formula confuses me:

http://en.wikipedia.org/wiki/Levenshtein_distance

It's been a while since I've done real math, but I still remember it.

It's basically describing the Levenshtein distance from word a to b. However, I'm lost as to what $i$ and $j$ represent. As well, I don't understand what the $\max(i,j) , \min(i,j) = 0$ means.

Also, mathematically, what do those curly braces mean? Does it mean the left side is equal to "one of the following"? If so, why is "$\min(i, j) = 0$" written so far to the right of the $\max(i,j)$? Is that just a formatting issue? And what is the "else" doing there? Wouldn't the else be implicit?

1

There are 1 best solutions below

0
On

The Levenshtein distance is computed by filling a matrix $lev_{a,b}$, where $i$ and $j$ denote the matrix indices. The formula describes how the matrix elements are defined. The first row of the formula means that $lev_{a,b}(i,j)$ equals the maximum of $i$ and $j$ if their minimum is zero, i.e. if either $i$ or $j$ is zero. In plain English this simply means that the first column of the matrix is $0,1,2,3,...$ (up to length of string $a$) and that its first row is $0,1,2,3,...$ (up to the length of string $b$). In all other cases (the 'else' in the formula), the matrix elements are defined by the minimum of the 3 expressions given by the formula. As you can see, the elements are defined recursively. After the matrix has been filled, the Levenshtein distance is the lower right matrix element $lev_{a,b}(N_a,N_b)$ ($N_a$ and $N_b$ are the lengths of the strings $a$ and $b$, respectively).