In the derivation of the KL divergence, where does the expectation come from?


I'm trying to understand the following part of the derivation of the KL divergence in variational inference.

\begin{align} D_\text{KL}[Q(z\mid X) \parallel P(z\mid X)] &= \sum_z Q(z\mid X) \log \frac{Q(z\mid X)}{P(z\mid X)} \tag 1 \\[8pt] &= \operatorname E\left[\log\frac{Q(z\mid X)}{P(z\mid X)}\right] \tag 2 \\[8pt] &= \operatorname E[\log Q(z\mid X) - \log P(z\mid X)] \tag 3 \end{align}

I don't understand how to go from step $(1)$ to step $(2)$. How did the summation in $(1)$ turn into the expectation in $(2)$?

Best answer:

$Q(z \mid X)$ is a PMF.

In general, for a discrete random variable $Z$ with PMF $p(z)$, the expectation of a function $g(Z)$ is $$E[g(Z)] = \sum_z g(z)\, p(z).$$ In your case, $p(z) = Q(z \mid X)$ and $g(z) = \log \frac{Q(z \mid X)}{P(z \mid X)}$, so the sum in $(1)$ is exactly this expectation, taken with respect to $Q(z \mid X)$. That is why $(2)$ is sometimes written more explicitly as $E_{z \sim Q(z \mid X)}[\cdot]$.
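The identity is easy to check numerically. The sketch below uses two made-up discrete distributions over three values of $z$ (the probabilities are purely illustrative) and confirms that the explicit sum in $(1)$ and the expectation-under-$Q$ form in $(2)$ give the same number.

```python
import math

# Hypothetical discrete distributions over z in {0, 1, 2} (illustrative
# numbers only): Q plays the role of Q(z|X), P the role of P(z|X).
Q = [0.5, 0.3, 0.2]
P = [0.4, 0.4, 0.2]

# Step (1): the KL divergence written as an explicit sum over z.
kl_sum = sum(q * math.log(q / p) for q, p in zip(Q, P))

# Step (2): the same quantity as an expectation under Q of
# g(z) = log(Q(z|X) / P(z|X)), via E[g(Z)] = sum_z g(z) Q(z|X).
g = [math.log(q / p) for q, p in zip(Q, P)]
kl_expectation = sum(gz * q for gz, q in zip(g, Q))

print(kl_sum, kl_expectation)  # identical up to floating-point rounding
```

Nothing changes between the two computations except the bookkeeping: the weight $Q(z \mid X)$ that multiplies each term of the sum is precisely what makes it an expectation.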