Let $a,b\in\mathbb{S}^{2}$ be two points, where $\mathbb{S}^{2}$ is the two-dimensional simplex in $\mathbb{R}^{3}$, i.e. the set of all $x=(x_{1},x_{2},x_{3})$ with $x_{i}\geq{}0$ and $\sum_{i=1}^{3}x_{i}=1$. Here $x_{1},x_{2},x_{3}$ are the coordinates of $x$.
The Kullback-Leibler divergence is defined as follows:
$$ D_{\mbox{KL}}(b,a)=\sum_{i=1}^{3}b_{i}\log\frac{b_{i}}{a_{i}}, $$
and from now on I write $\delta(a,b)=D_{\mbox{KL}}(b,a)$. The function $\delta$ does not obey the triangle inequality. Some basic algebra reveals that for $a\neq{}b$ and any point $c=\theta{}a+(1-\theta)b$ with $\theta\in(0,1)$, i.e. any point strictly between $a$ and $b$,
$$ \delta(a,b)>\delta(a,c)+\delta(c,b). $$
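One way to carry out that algebra, as a sketch assuming $a\neq{}b$ and that all coordinates of $a$ and $b$ are strictly positive (so that every logarithm is finite): subtracting the two shorter legs from the direct trip gives

$$ \delta(a,b)-\delta(a,c)-\delta(c,b)=\sum_{i=1}^{3}b_{i}\log\frac{c_{i}}{a_{i}}-\sum_{i=1}^{3}c_{i}\log\frac{c_{i}}{a_{i}}=\sum_{i=1}^{3}(b_{i}-c_{i})\log\frac{c_{i}}{a_{i}}. $$

Since $c=\theta{}a+(1-\theta)b$, we have $b-c=\theta(b-a)$ and $c-a=(1-\theta)(b-a)$, hence $b_{i}-c_{i}=\frac{\theta}{1-\theta}(c_{i}-a_{i})$ and

$$ \delta(a,b)-\delta(a,c)-\delta(c,b)=\frac{\theta}{1-\theta}\sum_{i=1}^{3}(c_{i}-a_{i})\log\frac{c_{i}}{a_{i}}=\frac{\theta}{1-\theta}\left(\delta(a,c)+\delta(c,a)\right)>0. $$

Each summand $(c_{i}-a_{i})\log\frac{c_{i}}{a_{i}}$ is nonnegative, and at least one is strictly positive because $c\neq{}a$.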
To illustrate, if the Kullback-Leibler divergence were measuring the time it takes you to get from $a$ to $b$, this would be a strange world: the more often you stopped on the way, the shorter your trip would be. I was wondering if by dividing the trip into a lot of these shortcuts you could reduce your travel time as close to zero as you wanted.
For each $j\in\mathbb{N}$, let $c^{j}$ be the partition of the line segment from $a$ to $b$ into $2^{j}$ pieces of equal length, with dividing points $c^{jk}=a+\frac{k}{2^{j}}(b-a)$ for $k=0,\ldots,2^{j}$, so that the partitions become ever more finely grained as $j$ grows. The first partition is $a=c^{10},c^{11},b=c^{12}$ with $c^{11}=.5a+.5b$, the second partition is $a=c^{20},c^{21},c^{22},c^{23},b=c^{24}$, etc. Now let $T_{j}$ be the sequence defined as follows:
$$ T_{j}=\sum_{k=0}^{2^{j}-1}\delta\left(c^{jk},c^{j(k+1)}\right) $$
Does this sequence tend to zero? Why or why not?
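For anyone who wants to experiment, here is a minimal numerical sketch of $T_{j}$ in Python. The two points `a` and `b` are arbitrary illustrative choices with strictly positive coordinates, and the helper names (`kl`, `delta`, `point`, `T`) are my own; the output is only a numerical hint, not a proof.

```python
import math

def kl(p, q):
    """D_KL(p, q) = sum_i p_i * log(p_i / q_i); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def delta(a, b):
    """delta(a, b) = D_KL(b, a), the convention used in the question."""
    return kl(b, a)

def point(a, b, t):
    """Point a + t * (b - a) on the line segment from a to b."""
    return [ai + t * (bi - ai) for ai, bi in zip(a, b)]

def T(a, b, j):
    """Sum of delta over the 2**j consecutive legs of the j-th partition."""
    n = 2 ** j
    return sum(
        delta(point(a, b, k / n), point(a, b, (k + 1) / n))
        for k in range(n)
    )

# Illustrative points on the simplex (assumed, not taken from the question).
a = [0.2, 0.3, 0.5]
b = [0.6, 0.3, 0.1]

# j = 0 is the direct trip from a to b, included for comparison.
for j in range(11):
    print(j, T(a, b, j))
```

Printing the ratio `T(a, b, j + 1) / T(a, b, j)` for successive `j` is a quick way to see how fast the total shrinks with each refinement.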