The right way to compute Jensen-Shannon divergence?


Given two discrete probability distributions:

$$P = [0.5, 0.3, 0.2]$$ $$Q = [0.1, 0.6, 0.3]$$ The mean is given by $$M = \frac{P + Q}{2} \implies M = [0.3, 0.45, 0.25]$$

What is the right way to compute the Jensen-Shannon divergence?

  1. $$JS(P||Q)=\frac{1}{2}\left[\left\{0.5\times\log\left(\frac{0.5}{0.3}\right)+0.1\times\log\left(\frac{0.1}{0.3}\right)\right\}+\left\{0.3\times\log\left(\frac{0.3}{0.45}\right)+0.6\times\log\left(\frac{0.6}{0.45}\right)\right\}\\+\left\{0.2\times\log\left(\frac{0.2}{0.25}\right)+0.3\times\log\left(\frac{0.3}{0.25}\right)\right\}\right]$$

  2. Alternatively, $$P = [0.5, 0.3, 0.2] = 0.5(1, 0, 0) + 0.3(0, 1, 0) + 0.2(0, 0, 1)$$ $$Q = [0.1, 0.6, 0.3] = 0.1(1, 0, 0) + 0.6(0, 1, 0) + 0.3(0, 0, 1)$$

    $$P = \begin{bmatrix} 0.5 & 0 & 0 \\ 0 & 0.3 & 0 \\ 0 & 0 & 0.2 \end{bmatrix}$$

$$Q = \begin{bmatrix} 0.1 & 0 & 0 \\ 0 & 0.6 & 0 \\ 0 & 0 & 0.3 \end{bmatrix}$$

$$JS(P||Q) = pairwise\_divergence\_between(P, Q)$$

Here, $pairwise\_divergence\_between(P, Q)$ is a divergence matrix obtained by computing divergences between the pairs of rows of matrices $P$ & $Q$.

I know the first option is right, but I am considering the case where the entries of a vector are represented in n-dimensional space. Any vector can be written in the standard basis, so the matrices $P$ and $Q$ in the second option hold the 3 components along the 3 dimensions. I am looking for a way to prove that both methods are equivalent (after some calculation in the second option, of course).
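For reference, the first option can be checked numerically. This is only a minimal sketch: it assumes natural logarithms (the base is not fixed above) and the convention $0 \log 0 = 0$, and the function names `kl` and `js` are illustrative, not taken from any library.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions (natural log)."""
    # Terms with p_i == 0 contribute nothing (convention 0 * log 0 = 0).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence: the average of KL(p || m) and KL(q || m)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture M = (P + Q) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

P = [0.5, 0.3, 0.2]
Q = [0.1, 0.6, 0.3]
print(round(js(P, Q), 4))  # about 0.1033 with natural logs
```

Grouping the sum by component, as in option 1, or by distribution, as in this sketch, gives the same value because only the order of the terms changes.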


There is 1 best solution below


I figured out that the final output of option $1$ above is equal to the trace of the matrix obtained from pairwise_divergence_between(P,Q), i.e., $JS(P||Q)=\operatorname{tr}\left(pairwise\_divergence\_between(P,Q)\right)$ for the matrices in option $2$ above.

That means the pairwise divergence matrix is a diagonal matrix, with the off-diagonal entries being $0$. Each diagonal entry then represents the contribution of the corresponding component of the probability distributions $P$ and $Q$ to the JS divergence.

However, I am not sure what each of these diagonal entries represents mathematically or geometrically, or what the significance of an individual entry is.
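One way to inspect the diagonal entries numerically is the sketch below. It assumes that the $i$-th diagonal entry is the $i$-th braced term of option $1$ with the factor $\frac{1}{2}$ included, and that logs are natural; `js_terms` is a made-up helper name, not a library function.

```python
import math

def js_terms(p, q):
    """Per-component contributions to JS(p || q), one value per index i."""
    terms = []
    for pi, qi in zip(p, q):
        mi = (pi + qi) / 2  # i-th entry of the mixture M
        # Convention 0 * log 0 = 0: zero-probability parts are skipped.
        part_p = pi * math.log(pi / mi) if pi > 0 else 0.0
        part_q = qi * math.log(qi / mi) if qi > 0 else 0.0
        terms.append(0.5 * (part_p + part_q))
    return terms

P = [0.5, 0.3, 0.2]
Q = [0.1, 0.6, 0.3]
diag = js_terms(P, Q)   # assumed diagonal entries
trace = sum(diag)       # sum of the diagonal entries
print([round(t, 4) for t in diag], round(trace, 4))
```

Under that assumption the trace reproduces the value of option $1$ (about 0.1033), and each entry is non-negative by the log-sum inequality, so one possible reading is that an entry measures how much a single outcome contributes to the total divergence.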