I am trying to define a distance $F(X,Y)$ between two multisets $X$ and $Y$. For each pair $x \in X$, $y \in Y$ there is a distance function $f(x,y)$ taking values in $[0,1]$. An additional requirement on $F(X,Y)$ is that $F=0$ if one of $X,Y$ is a multiple of the other. Does anyone know of an established metric that satisfies these requirements, or could anyone point me to some literature to begin with?
Thanks a lot!
I'm going to nonstandardly use $\in^\#$ to refer to the version of $\in$ that returns a natural number indicating the multiplicity of an element in a multiset. I will assume that values cannot occur a negative number of times or an infinite number of times.
I am assuming that "is a multiple of" means that there exists some fixed constant $c$ so that, for all $a$, $(a \in^\# X) = c \times (a \in^\# Y)$.
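One thing you can do, assuming that $X$ and $Y$ are both finite and non-empty, is to "flatten" them into probability distributions: assign each element $a$ the probability $(a \in^\# X) / |X|$, where $|X|$ is the total number of elements counted with multiplicity. A finite multiset can thus be viewed as a discrete distribution over its elements. If one multiset is a multiple of the other, the constant $c$ cancels in the normalization, so both flatten to the same distribution and any distance between the distributions is $0$, as you require.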
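To make the flattening step concrete, here is a minimal Python sketch. The function name `flatten` and the example multisets are my own illustration, not anything standard; it assumes the multiset is given as a finite, non-empty iterable of hashable items.

```python
from collections import Counter
from fractions import Fraction


def flatten(multiset):
    """Map a finite, non-empty multiset to a dict {element: probability}.

    `flatten` is a hypothetical helper name used only for this sketch.
    Exact fractions are used so that equal distributions compare equal.
    """
    counts = Counter(multiset)          # element -> multiplicity
    total = sum(counts.values())        # size counted with multiplicity
    if total == 0:
        raise ValueError("cannot normalize an empty multiset")
    return {value: Fraction(n, total) for value, n in counts.items()}


X = ["a", "b", "b"]
Y = ["a", "a", "b", "b", "b", "b"]      # every multiplicity of X doubled
Z = ["a", "a", "b"]                     # not a multiple of X

same = flatten(X) == flatten(Y)         # multiples flatten identically
different = flatten(X) != flatten(Z)
```

Because `X` and `Y` flatten to the same distribution, any distance computed on the flattened distributions treats them as identical.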
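Now that you have two probability distributions, one choice of metric is the earth mover's distance (also called the Wasserstein distance), which can use your $f(x,y)$ as the ground distance between elements. Another choice is the Kolmogorov-Smirnov test statistic, although that requires the elements to be ordered (for example, real numbers). A third choice is the Kullback-Leibler divergence, but that has the disadvantage of not being symmetric (it also fails the triangle inequality), so it isn't a true metric.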
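As a rough illustration of the first two options, here is a Python sketch for the special case where the elements are real numbers and the ground distance is $f(x,y) = |x - y|$ (my assumption, not something stated in the question). In that case both quantities can be read off the cumulative distribution functions: the earth mover's distance is the area between them and the Kolmogorov-Smirnov statistic is the largest gap. The names `flatten` and `emd_and_ks` are hypothetical. For a general $f$ the earth mover's distance requires solving a transport problem, which this sketch does not do.

```python
from collections import Counter
from fractions import Fraction


def flatten(multiset):
    """Hypothetical helper: finite non-empty multiset -> {element: probability}."""
    counts = Counter(multiset)
    total = sum(counts.values())
    return {value: Fraction(n, total) for value, n in counts.items()}


def emd_and_ks(p, q):
    """Return (earth mover's distance, KS statistic) for two discrete
    distributions on the real line, assuming ground distance |x - y|."""
    points = sorted(set(p) | set(q))    # combined support, in order
    cum_p = cum_q = 0                   # running CDF values
    emd = 0
    ks = 0
    for i, x in enumerate(points):
        cum_p += p.get(x, 0)
        cum_q += q.get(x, 0)
        gap = abs(cum_p - cum_q)        # |F_p(x) - F_q(x)|
        ks = max(ks, gap)               # KS: largest CDF gap
        if i + 1 < len(points):
            # EMD in 1D: CDF gap times the width of the interval it spans
            emd += gap * (points[i + 1] - x)
    return emd, ks


X = [0, 1]
Y = [0, 0, 1, 1]                        # X with every multiplicity doubled
Z = [1, 1]

d_xy = emd_and_ks(flatten(X), flatten(Y))
d_xz = emd_and_ks(flatten(X), flatten(Z))
```

Here `d_xy` is zero in both components because `Y` is a multiple of `X`, while `d_xz` is positive. If the elements lie in $[0,1]$, the distance $|x - y|$ also stays in $[0,1]$, matching the range of $f$ in the question.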