In probability theory, the conditional expectation $E(X|Y)$ and conditional variance $V(X|Y)$ are usually taken to be random variables, so that the value of $E(X|Y)$ depends on what value $Y$ ends up taking.
I've just started learning information theory, but I get the impression that conditionals are usually "averaged out", so that $H(X|Y)$ really means $E\,H(X|Y)$. I suppose it just turns out to be more practical that way.
Is that a correct distinction to make? Are there ever examples in information theory where the conditional entropy (or divergence etc.) is taken to be a random variable and not "averaged out"?
Taking a quick look at the definition of the conditional entropy, $$H(X|Y) = \sum_{y} p(y)\,H(X|Y = y),$$ we can say that $H(X|Y = y)$ is a number for each fixed $y$; if we substitute the random variable $Y$ for $y$, it becomes a random variable, whereas $H(X|Y)$ is a constant. Note that $H(X|Y)$ is the expected value of $H(X|Y = y)$ over $y$, so the conditional entropy is indeed defined with the averaging already built in.
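To make the distinction concrete, here is a small sketch (with a made-up joint distribution, chosen only for illustration) that computes the per-outcome entropies $H(X|Y=y)$ and their $p(y)$-weighted average $H(X|Y)$:

```python
import math

# Hypothetical joint distribution p(x, y), X and Y each in {0, 1}.
p_xy = {
    (0, 0): 0.25, (1, 0): 0.25,  # given Y=0, X is uniform -> H(X|Y=0) = 1 bit
    (0, 1): 0.40, (1, 1): 0.10,  # given Y=1, X is skewed  -> H(X|Y=1) < 1 bit
}

def p_Y(y):
    """Marginal p(y)."""
    return sum(p for (x, yy), p in p_xy.items() if yy == y)

def H_X_given_Y_eq_y(y):
    """H(X|Y=y): a plain number for each fixed outcome y."""
    cond = [p / p_Y(y) for (x, yy), p in p_xy.items() if yy == y]
    return -sum(q * math.log2(q) for q in cond if q > 0)

# H(X|Y): the p(y)-weighted average of those numbers -- a constant.
H_X_given_Y = sum(p_Y(y) * H_X_given_Y_eq_y(y) for y in (0, 1))

print(H_X_given_Y_eq_y(0))  # 1.0 bit
print(H_X_given_Y)          # strictly less than 1 bit
```

The two values of `H_X_given_Y_eq_y` differ, which is exactly the "random variable" view: before observing $Y$, the entropy you will face is uncertain. `H_X_given_Y` collapses that into a single constant.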