KL Divergence (t-SNE) I have $$ C = \sum_{i} KL(P_i \| Q_i) = \sum_{i} \sum_{j} p_{j|i} \log \frac{p_{j|i}}{q_{j|i}}$$
This is the standard KL divergence. $P_i$ represents the conditional probability distribution over all other data-points given $x_i$, and $Q_i$ represents the conditional probability distribution over all other map points given $y_i$. I understand that this is asymmetric.
I am not able to understand this line: "In particular, there is a large cost for using far map points to represent data-points that are close (i.e., using a small $q_{j|i}$ to model a large $p_{j|i}$)." Can anybody help me understand this statement? Here is the link to the full paper, pg 2.
I took a quick look at the paper. I think that sentence is helping you read the formula by pointing out that when $p_{j|i} \gg q_{j|i}$, the term $p_{j|i} \log(p_{j|i}/q_{j|i})$ contributes a lot to the KL divergence, whereas in the reverse case ($q_{j|i} \gg p_{j|i}$) the same term contributes very little, because the small factor $p_{j|i}$ damps the logarithm. It explains in words what that asymmetry means in this particular application: the embedding pays a large penalty for placing neighbors far apart, but only a small one for placing distant points close together.
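To see this numerically, here is a small sketch of one term of the sum; the probability values 0.8 and 0.01 are arbitrary illustrations, not from the paper:

```python
import math

def kl_term(p, q):
    # Contribution of a single (j|i) pair to C = sum_j p_{j|i} * log(p_{j|i} / q_{j|i})
    return p * math.log(p / q)

# Nearby data-points mapped far apart: large p, small q -> large cost
cost_close_as_far = kl_term(0.8, 0.01)

# Distant data-points mapped close together: small p, large q -> tiny cost
cost_far_as_close = kl_term(0.01, 0.8)

print(cost_close_as_far)   # roughly 3.5
print(cost_far_as_close)   # roughly -0.04, negligible in magnitude
```

The first case dominates the objective while the second barely registers, which is exactly the asymmetry the paper is describing.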