I'm trying to understand the following part of the derivation of the KL divergence in variational inference.
\begin{align}
& D_\text{KL}[Q(z\mid X) \parallel P(z\mid X)] = \sum_z Q(z\mid X) \log \frac{Q(z\mid X)}{P(z\mid X)} \tag{1} \\[8pt]
= {} & \operatorname E\left[\log\frac{Q(z\mid X)}{P(z\mid X)}\right] \tag{2} \\[8pt]
= {} & \operatorname E[\log Q(z\mid X) - \log P(z\mid X)] \tag{3}
\end{align}
I don't understand how to go from step $(1)$ to step $(2)$. How did the summation in $(1)$ turn into the expectation in $(2)$?
$Q(z \mid X)$ is a PMF.
In general, for a discrete random variable $Z$ with PMF $p(z)$, the expectation of any function $g$ of $Z$ is $$E[g(Z)] = \sum_z g(z)\, p(z).$$ Step $(1)$ is exactly this sum with $p(z) = Q(z \mid X)$ and $g(z) = \log \frac{Q(z \mid X)}{P(z \mid X)}$, so it can be rewritten as the expectation in $(2)$. Note that the expectation is taken with respect to $Q(z \mid X)$, not $P(z \mid X)$.
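To see the identity numerically, here is a small sketch (the distributions `Q` and `P` are made-up three-point PMFs, not from the original question): it computes the KL divergence once as the explicit sum in $(1)$ and once as a Monte Carlo estimate of the expectation in $(2)$, sampling $z \sim Q(z \mid X)$.

```python
import numpy as np

# Hypothetical PMFs over z in {0, 1, 2} (illustrative numbers only).
Q = np.array([0.5, 0.3, 0.2])   # Q(z | X), the variational distribution
P = np.array([0.4, 0.4, 0.2])   # P(z | X), the target posterior

g = np.log(Q / P)               # g(z) = log Q(z|X) / P(z|X)

# Step (1): explicit sum over z.
kl_sum = np.sum(Q * g)

# Step (2): E[g(Z)] under Q, estimated by sampling z ~ Q(z|X).
rng = np.random.default_rng(0)
samples = rng.choice(len(Q), size=200_000, p=Q)
kl_mc = g[samples].mean()

print(kl_sum, kl_mc)  # the two values agree up to Monte Carlo noise
```

Both routes estimate the same quantity because averaging $g(z)$ over draws from $Q$ is just the sampling analogue of weighting $g(z)$ by $Q(z \mid X)$ in the sum.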