Specifically, I am looking at finite subsets of a set that is a discrete metric space under Jaccard Distance. I'm having trouble proving the triangle inequality or coming up with a counterexample.
If it is not, and I'm overlooking some obvious counterexample, does anyone know of a metric that would accurately capture the similarity of these sets? (Note: I am NOT looking for the Hausdorff distance, as it deals with extrema, which doesn't capture the subtleties of my dataset.)
So, if I get you right, for $X$ a metric space with metric $d(x,y)$, define $D(U,V)$ for non-empty finite subsets $U,V\subset X$ by $$ D(U,V)=\frac{1}{|U|\cdot|V|}\sum_{u\in U,\,v\in V}d(u,v). $$ A metric would require $D(U,U)=0$, but if $|U|>1$ we get $D(U,U)>0$, so this is not a metric. For example, $U=\{a,b\}$ with $a\ne b$ gives $D(U,U)=\tfrac14\bigl(d(a,a)+d(a,b)+d(b,a)+d(b,b)\bigr)=\tfrac12 d(a,b)>0$.
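However, you specifically ask if the triangle inequality $D(U,V)+D(V,W)\ge D(U,W)$ is true. It is, for the simple reason that $$ D(U,V)+D(V,W)=\tfrac{1}{|U|\cdot|V|\cdot|W|}\sum_{u,v,w}\bigl(d(u,v)+d(v,w)\bigr) \ge\tfrac{1}{|U|\cdot|V|\cdot|W|}\sum_{u,v,w}d(u,w)=D(U,W) $$ where $u\in U$, $v\in V$, and $w\in W$. The first equality holds because summing $d(u,v)$ over $w\in W$ just multiplies it by $|W|$ (and likewise for $d(v,w)$ and $u\in U$, and for $d(u,w)$ and $v\in V$), and the inequality is the triangle inequality for $d$ applied term by term.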
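Both observations can be sanity-checked numerically. The following is only a sketch: it uses the 0/1 discrete metric on four points as a stand-in for $d$, and the names `discrete` and `avg_dist` are my own, not anything from the question.

```python
from fractions import Fraction
from itertools import combinations, product


def discrete(x, y):
    """0/1 discrete metric, used here only as an illustrative choice of d."""
    return 0 if x == y else 1


def avg_dist(U, V, d=discrete):
    """D(U, V): average of d(u, v) over all pairs (u, v) in U x V."""
    total = sum(d(u, v) for u, v in product(U, V))
    return Fraction(total, len(U) * len(V))


# All non-empty subsets of a 4-point space.
points = range(4)
subsets = [frozenset(c) for r in range(1, 5) for c in combinations(points, r)]

# D(U, U) for a two-point set: (0 + 1 + 1 + 0) / 4.
self_dist = avg_dist({0, 1}, {0, 1})

# Brute-force check of D(U, V) + D(V, W) >= D(U, W) over all triples.
triangle_ok = all(
    avg_dist(U, V) + avg_dist(V, W) >= avg_dist(U, W)
    for U, V, W in product(subsets, repeat=3)
)

print(self_dist, triangle_ok)  # 1/2 True
```

So on this small example $D$ satisfies the triangle inequality but fails $D(U,U)=0$, as argued above. The brute-force check is of course no substitute for the proof.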
I don't see any quick fix that turns this into a metric with $D(U,U)=0$. If I try defining a new distance function $d_D(U,V)=2D(U,V)-D(U,U)-D(V,V)$, then for some metrics $d$ one may get $d_D(U,V)<0$, which is even worse.
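To see that $d_D$ can really go negative, here is one concrete instance I worked out, offered as a sketch: take the shortest-path metric on the complete bipartite graph $K_{2,3}$, with $U$ and $V$ the two sides. The point labels and the helper names `d_graph`, `avg_dist`, and `d_D` are my own choices.

```python
from fractions import Fraction
from itertools import product

# Shortest-path metric on K_{2,3}: points on opposite sides are at
# distance 1, distinct points on the same side are at distance 2.
A = ["a1", "a2"]
B = ["b1", "b2", "b3"]


def d_graph(x, y):
    if x == y:
        return 0
    return 1 if (x in A) != (y in A) else 2


def avg_dist(U, V, d):
    """D(U, V): average of d(u, v) over all pairs (u, v) in U x V."""
    total = sum(d(u, v) for u, v in product(U, V))
    return Fraction(total, len(U) * len(V))


def d_D(U, V, d):
    """Candidate fix: 2 D(U, V) - D(U, U) - D(V, V)."""
    return 2 * avg_dist(U, V, d) - avg_dist(U, U, d) - avg_dist(V, V, d)


# Confirm d_graph satisfies the triangle inequality on all triples.
is_metric = all(
    d_graph(x, z) <= d_graph(x, y) + d_graph(y, z)
    for x, y, z in product(A + B, repeat=3)
)

value = d_D(A, B, d_graph)
print(is_metric, value)  # True -1/3
```

Here $D(A,B)=1$, $D(A,A)=1$ and $D(B,B)=\tfrac43$, so $d_D(A,B)=2-1-\tfrac43=-\tfrac13$.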