Expected value of sample entropy of a die


A die has $K$ sides; side $k$ has probability $p_k$ of coming up in a roll, where $p_k \geq 0$ and $\sum_{k=1}^K p_k = 1$. Let $X$ denote the random variable corresponding to such a probability distribution. Then the entropy of $X$ is given by

$$H(X) = -\sum_{k=1}^K p_k \log_2 p_k $$

Now, let's say we have $N$ i.i.d. samples $x_i$ of the random variable $X$, but we don't know the $p_k$. We would like to estimate the entropy of the original random variable. To do so, we first estimate the original probabilities

$$\hat{p}_k = \frac{1}{N}\sum_{i=1}^N I(x_i = k)$$
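As a sanity check, this frequency estimate is easy to compute numerically. A minimal sketch, assuming a hypothetical 4-sided die with made-up probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-sided die; in practice the p_k would be unknown.
p = np.array([0.1, 0.2, 0.3, 0.4])
K = len(p)
N = 10_000

# N i.i.d. rolls x_i, each taking a value in {1, ..., K}.
x = rng.choice(np.arange(1, K + 1), size=N, p=p)

# p_hat_k = (1/N) * sum_i I(x_i == k)
p_hat = np.array([(x == k).mean() for k in range(1, K + 1)])
print(p_hat)  # close to p for large N
```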

where $I$ is the indicator function (1 if its argument is true, 0 otherwise). Since each summand is a Bernoulli random variable, the scaled estimator $N\hat{p}_k$ follows a binomial distribution

$$\hat{N}_k = N\hat{p}_k \sim \operatorname{Bin}(N, p_k)$$

Indeed this is a reasonable estimate: the expected value of the binomial distribution is $Np_k$, so $\langle \hat{p}_k \rangle = p_k$ (the estimator is unbiased), and by the law of large numbers $\hat{p}_k \to p_k$ as $N \to \infty$.

Now we are ready to make a guess at the entropy. Our naive guess will be

$$\hat{H}(X) = -\sum_{k=1}^K \hat{p}_k \log_2 \hat{p}_k$$
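The plug-in estimator is straightforward to implement. A sketch, again assuming a hypothetical 4-sided die with made-up probabilities, using the usual convention $0 \log_2 0 = 0$ so that empty categories contribute nothing:

```python
import numpy as np

def plugin_entropy(x, K):
    """Naive plug-in entropy estimate (in bits) from samples in {1, ..., K}."""
    p_hat = np.bincount(x, minlength=K + 1)[1:] / len(x)
    nz = p_hat > 0                         # convention: 0 * log2(0) = 0
    return -np.sum(p_hat[nz] * np.log2(p_hat[nz]))

rng = np.random.default_rng(1)
p = np.array([0.1, 0.2, 0.3, 0.4])        # hypothetical die, unknown in practice
x = rng.choice(np.arange(1, 5), size=5_000, p=p)

H_true = -np.sum(p * np.log2(p))
H_hat = plugin_entropy(x, K=4)
print(H_hat, H_true)                      # close, but not identical
```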

I would like to know how good this guess is. A natural first step is to check whether it has an asymptotic bias, i.e., to find its expected value. It is useful to rewrite it in terms of $\hat{N}_k$, since we know that variable's distribution.

$$\hat{H}(X) = -\sum_{k=1}^K \frac{\hat{N}_k}{N} \log_2 \frac{\hat{N}_k}{N} = \log_2 N -\frac{1}{N}\sum_{k=1}^K \hat{N}_k \log_2 \hat{N}_k $$

To find the expected value $\langle \hat{H}(X) \rangle$, by linearity of expectation it suffices to know $\langle \hat{N}_k \log_2 \hat{N}_k \rangle$.

This is where I am stuck. How do I find $\langle \hat{N}_k \log_2 \hat{N}_k \rangle$, where $\hat{N}_k$ is binomial with $N$ trials and success probability $p_k$? It sounds like basic math, but my naive attempts quickly drown in unwieldy combinatorial expressions.


Hint: Writing the expectation as a sum over the possible values (the law of the unconscious statistician): $$\mathbb E \left(\hat{N}_k \log_2 \hat{N}_k\right) = \sum_{n=0}^{N} n \log_2 n \cdot \mathbb P(\hat{N}_k \log_2 \hat{N}_k = n \log_2 n),$$ with the convention $0 \log_2 0 = 0$. The function $f(x) = x \log_2 x$ is strictly monotone for $x \ge 1$. Therefore $$\mathbb P(\hat{N}_k \log_2 \hat{N}_k = n \log_2 n) = \mathbb P(\hat{N}_k = n) \quad (\forall n \ge 1)$$
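Following this hint, the expectation can be computed exactly by summing $n \log_2 n$ against the binomial pmf (the $n = 0$ term vanishes under $0 \log_2 0 = 0$). A sketch for a made-up 4-sided die, which also exposes the downward bias of the plug-in estimator:

```python
import math

def E_nlog2n(N, p):
    """Exact E[M log2 M] for M ~ Bin(N, p); the n = 0 term is 0 by convention."""
    return sum(math.comb(N, n) * p**n * (1 - p)**(N - n) * n * math.log2(n)
               for n in range(1, N + 1))

def expected_plugin_entropy(N, p):
    """Exact <H_hat> = log2(N) - (1/N) * sum_k E[N_k log2 N_k]."""
    return math.log2(N) - sum(E_nlog2n(N, pk) for pk in p) / N

p = [0.1, 0.2, 0.3, 0.4]   # hypothetical die
N = 50
H_true = -sum(pk * math.log2(pk) for pk in p)
E_H = expected_plugin_entropy(N, p)
print(H_true, E_H)          # E_H < H_true: the plug-in estimator is biased low
```

For large $N$ the gap $H(X) - \langle \hat{H}(X) \rangle$ is roughly $(K-1)/(2N \ln 2)$ bits, the Miller–Madow bias correction, which the exact computation above lets you verify directly.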