I have a question regarding the definition of entropy as the expected value of the random variable $\log \frac{1}{p(X)}$:
$H(X) = E \log \frac{1}{p(X)}$,
where $X$ is drawn according to the probability mass function $p(x)$.
The problem is I still don't understand two things:
1) How is this formula derived from the original formula for the entropy
$H(X) = - \sum_{x \in X} p(x) \log p(x)$.
2) Even without knowing how to derive the second formula, what is the meaning of $p(X)$? Can you show how to find the entropy of a fair die in one toss using the second formula?
Appreciate your help!
1) Suppose $X$ is a random variable that takes only a finite number of values, say: $x_1$ with probability $p_1$, $x_2$ with probability $p_2, \ldots, x_n$ with probability $p_n$.
What is the expectation of $X$? Well, easy, $E(X) = \sum_{i=1}^n x_i p_i.$ What about the random variable $X^2$? Well, $E(X^2) = \sum_{i=1}^n x_i^2 p_i$.
The same goes for the random variable $p(X)$. You can think of it as taking the value $p_i$ whenever $X$ takes the value $x_i$. Therefore its expectation is $E(p(X)) = \sum_{i=1}^n p_i^2$.
Finally, the random variable $\log\left(\frac{1}{p(X)}\right) = - \log(p(X))$ has expectation $$E\left[\log\left(\frac{1}{p(X)}\right)\right] = E[-\log(p(X))] = \sum_{i=1}^n (- \log(p_i))p_i = -\sum_{i=1}^n p_i \log(p_i) $$
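To see concretely that $-\log(p(X))$ really is a random variable whose average equals $-\sum_i p_i \log(p_i)$, here is a small Python sketch (the three-point distribution is an assumption chosen just for illustration): it compares the exact expectation computed by the sum above against a Monte Carlo average of $-\log p(X)$ over simulated draws of $X$.

```python
import math
import random

# An arbitrary example distribution: values x_i with probabilities p_i.
values = [1, 2, 3]
probs = [0.5, 0.25, 0.25]

# Exact expectation of -log p(X): sum over outcomes of p_i * (-log p_i).
exact = -sum(p * math.log(p) for p in probs)

# Monte Carlo estimate: draw X ~ p many times, evaluate -log p(X) at each
# draw, and average. This treats p(X) as a random variable, as in the answer.
random.seed(0)
n = 200_000
prob_of = dict(zip(values, probs))
draws = random.choices(values, weights=probs, k=n)
estimate = sum(-math.log(prob_of[x]) for x in draws) / n

print(exact, estimate)  # the two agree up to sampling error
```

The sample average converges to the exact sum as the number of draws grows, which is exactly the statement $E[-\log(p(X))] = -\sum_i p_i \log(p_i)$.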
2) For instance, for a fair die, $p(1) = p(2) = \ldots = p(6) = 1/6$, hence $$H(X) = - 6 \cdot \frac{1}{6}\log\left(\frac{1}{6}\right) = \log(6)$$
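The die calculation can be checked in a few lines of Python. Since $p(X) = 1/6$ no matter which face comes up, the random variable $\log(1/p(X))$ is constant, and its expectation is just $\log(6)$:

```python
import math

# Fair six-sided die: p(x) = 1/6 for every face x.
p = [1 / 6] * 6

# Entropy as E[log(1/p(X))]: sum over faces of p(x) * log(1/p(x)).
H = sum(px * math.log(1 / px) for px in p)

print(H, math.log(6))  # both equal log(6), about 1.7918 nats
```

Using base-2 logarithms instead (`math.log2`) gives the same entropy in bits, $\log_2(6) \approx 2.585$.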