From integral to expected value of probabilities


I was reading about KL divergence and stumbled upon this post. In Equation (2), we can see the following:

\begin{align*}
D_{KL}(Q||P) &= \int_{-\infty}^{\infty} q(\theta|X) \log\frac{q(\theta|X)}{p(\theta|X)} d\theta \\
&= \int_{-\infty}^{\infty} q(\theta|X) \log\frac{q(\theta|X)}{p(\theta,X)} d\theta + \int_{-\infty}^{\infty} q(\theta|X) \log{p(X)} d\theta \\
&= \int_{-\infty}^{\infty} q(\theta|X) \log\frac{q(\theta|X)}{p(\theta,X)} d\theta + \log{p(X)} \\
&= E_q\left[\log\frac{q(\theta|X)}{p(\theta,X)}\right] + \log p(X) \tag{2}
\end{align*}

To go from line (1) to line (2), the identities $p(\theta|X) = \frac{p(\theta,X)}{p(X)}$ and $\log{AB} = \log{A} + \log{B}$ are applied. Then, to go from line (2) to line (3): given that $q(\theta|X)$ is a probability density and $\log{p(X)}$ does not depend on $\theta$, $\int_{-\infty}^{\infty} q(\theta|X) d\theta = 1$ is applied. I don't understand how to go from line (3) to line (4). I think the same rule cannot be applied, because the other terms also depend on $\theta$. Can someone help me understand why:

\begin{align*} \int_{-\infty}^{\infty} q(\theta|X) \log\frac{q(\theta|X)}{p(\theta,X)} d\theta = E_q\left[\log\frac{q(\theta|X)}{p(\theta,X)}\right] \end{align*}
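One way to probe this identity is numerically. For reference, the expectation of a function $f(\theta)$ under a density $q$ is defined as $E_q[f(\theta)] = \int q(\theta|X) f(\theta) d\theta$, so the sketch below compares the left side (a trapezoidal quadrature of the integral) with the right side (a sample average of the log-ratio over draws from $q$). Nothing here comes from the linked post: the Gaussian choices for $q(\theta|X)$ and $p(\theta|X)$, the value `P_X` standing in for $p(X)$, and all function names are assumptions made only for illustration.

```python
import math
import random

def normal_logpdf(x, mu, sigma):
    """Log-density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

# Hypothetical stand-ins (assumptions, not taken from the post):
MU_Q, SIGMA_Q = 0.5, 0.8   # q(theta|X) = N(0.5, 0.8^2)
MU_P, SIGMA_P = 0.0, 1.0   # p(theta|X) = N(0, 1)
P_X = 0.3                  # p(X), a constant that does not depend on theta

def log_q(theta):
    return normal_logpdf(theta, MU_Q, SIGMA_Q)

def log_p_joint(theta):
    # p(theta, X) = p(theta|X) * p(X)
    return normal_logpdf(theta, MU_P, SIGMA_P) + math.log(P_X)

def log_ratio(theta):
    # The function inside the expectation: log q(theta|X) / p(theta, X)
    return log_q(theta) - log_p_joint(theta)

def integral_form(lo=-10.0, hi=10.0, n=20001):
    """Left side: trapezoidal approximation of the integral of q * log_ratio."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        theta = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(log_q(theta)) * log_ratio(theta)
    return total * h

def expectation_form(n_samples=200_000, seed=0):
    """Right side: average of log_ratio over samples drawn from q."""
    rng = random.Random(seed)
    draws = (rng.gauss(MU_Q, SIGMA_Q) for _ in range(n_samples))
    return sum(log_ratio(theta) for theta in draws) / n_samples

integral_value = integral_form()
expectation_value = expectation_form()
print(f"integral form:    {integral_value:.4f}")
print(f"expectation form: {expectation_value:.4f}")
```

Under these assumptions the two printed numbers should agree up to Monte Carlo error, which suggests that the factor $q(\theta|X)$ in the integrand plays the role of the weighting density rather than something that integrates away on its own.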

Any help is appreciated.