I'm trying to understand t-SNE based on this video, and everything is clear except for one key part of the logic, right at the time point linked to above: namely, minimizing the Kullback-Leibler divergence.
The stated problem is that we want to choose $ q_{ij} $ such that $$ \sum\limits_{i}\sum\limits_{j \neq i} p_{ij} \log \left( \frac{p_{ij}}{q_{ij}} \right) $$ is as small as possible, and
$$ q_{ij}=\frac{(1+|y_i-y_j|^2)^{-1}}{ \sum\limits_{k} \sum\limits_{l\neq k} \left( 1+|y_{k}-y_{l}|^2 \right)^{-1}} $$
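To see the formula in action, here is a short NumPy sketch that computes the $q_{ij}$ matrix from a set of low-dimensional points. The points `Y` are an arbitrary random example (not actual t-SNE output), chosen just to illustrate the normalization:

```python
import numpy as np

def q_matrix(Y):
    """q_ij = (1 + |y_i - y_j|^2)^(-1), normalized over all pairs i != j."""
    # Pairwise squared Euclidean distances |y_i - y_j|^2 via broadcasting.
    sq_dists = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    inv = 1.0 / (1.0 + sq_dists)   # Student-t kernel weights
    np.fill_diagonal(inv, 0.0)     # exclude the j == i terms
    return inv / inv.sum()         # normalize so all q_ij sum to 1

rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 2))        # 5 hypothetical 2-D map points
Q = q_matrix(Y)
print(Q.sum())                     # sums to 1, like a probability distribution
```

Note that the entries are non-negative and sum to one by construction, which is exactly why they can be treated as probabilities in the KL objective.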
(I'm a little unsure about those $^{-1}$ exponents in the expression, but I'm copying what's in the video.) (Edit: they're right; the quantities are weighted inversely to the squared distances.)
I understand that $q_{ij}$ is constrained to the range $[0, 1]$, but ... these aren't probabilities (Edit: they actually are referred to as "probabilities of choosing each other as neighbours"); they're just normalized distances. So why is this a sensible comparison? Does this just work for any set of numbers that are confined to $[0, 1]$ and add up to one?
(I updated this question because I missed something obvious, but I'm still stuck. Thanks for any help you can offer)
Hopefully, the following will help:
If $\{a_i\}$ and $\{b_i\}$ are sequences with $a_i \ge 0$ and $b_i > 0$, then the following holds:
\begin{equation} \sum_i a_i \log \left(\frac{a_i}{b_i}\right) \ge \left(\sum_i a_i\right)\log \left(\frac{\sum_i a_i}{\sum_i b_i}\right). \end{equation}
This inequality is called the log-sum inequality and is often encountered in information theory and high-dimensional statistics.
In your case, $q_{ij}$ is positive and sums to 1 when summed over all $i$ and $j$ with $j\ne i$, so it technically is a probability distribution. Whether the right-hand side is non-negative depends on $p_{ij}$; I am assuming the $p_{ij}$ are non-negative and not all zero. In t-SNE the $p_{ij}$ also sum to 1, so applying the inequality with $a = p$ and $b = q$ gives a right-hand side of $1 \cdot \log(1/1) = 0$: the objective is always non-negative, and it vanishes exactly when $p_{ij} = q_{ij}$ for all pairs. That is why minimizing it drives $q$ toward $p$.
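A quick numeric sanity check of this argument: with two made-up distributions that each sum to 1 (as $p$ and $q$ do in t-SNE), the left-hand side of the log-sum inequality is the KL objective, and the right-hand side collapses to $1 \cdot \log(1/1) = 0$:

```python
import numpy as np

rng = np.random.default_rng(1)
p = rng.random(6); p /= p.sum()   # arbitrary probability vector
q = rng.random(6); q /= q.sum()   # another arbitrary probability vector

lhs = np.sum(p * np.log(p / q))                 # sum_i a_i log(a_i / b_i)
rhs = p.sum() * np.log(p.sum() / q.sum())       # = 1 * log(1/1) = 0

print(lhs, rhs)   # lhs >= rhs, and rhs is 0 because both sums are 1
```

For $p = q$ the left-hand side is exactly zero as well, matching the equality case of the inequality.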