Suppose $Z$ takes values in $\{0, 1, 2, \ldots\}$. Given $E[Z] = a$ for some $a > 0$, find the probability mass function $p_i = P(Z = i)$ that maximises $H(Z)$.
I think this reduces to an optimisation problem, but I'm not sure how to go about solving it.
The method of Lagrange multipliers is the way to go. In the present case, the entropy is given by $H[Z] = \displaystyle -\sum_k p_k \ln p_k$. Moreover, since the $p_k$ form a probability distribution (hence normalisation) with known mean $a$, the optimisation is subject to the two constraints
$$ (1)\;\,\sum_k p_k = 1 \quad\quad\&\quad\quad (2)\;\; \mathbb{E}[Z] = \sum_k kp_k = a $$
Consequently, we need two multipliers, and the associated Lagrangian takes the form
$$ L = -\sum_k p_k \ln p_k + \lambda \left(1 - \sum_k p_k\right) + \mu \left(a - \sum_k kp_k\right) $$
Setting the derivative with respect to each $p_k$ to zero gives the stationarity condition
$$ 0 = \frac{\partial L}{\partial p_k} = -(1 + \ln p_k) - \lambda - \mu k \quad \forall k = 0,1,2,\ldots $$
hence $p_k = e^{-\mu k - \lambda - 1}$. The Lagrange multipliers are determined with the help of the constraints. Assuming $\mu > 0$, so that the geometric series converges, normalisation gives
$$ \sum_k p_k = e^{-\lambda-1} \sum_k e^{-\mu k} = \frac{e^{-\lambda-1}}{1-e^{-\mu}} = 1 $$
and the mean constraint gives
$$ \sum_k kp_k = e^{-\lambda-1} \sum_k ke^{-\mu k} = -e^{-\lambda-1} \frac{\partial}{\partial\mu} \sum_k e^{-\mu k} = -e^{-\lambda-1} \frac{\partial}{\partial\mu} \frac{1}{1-e^{-\mu}} = e^{-\lambda-1} \frac{e^{-\mu}}{(1-e^{-\mu})^2} = a $$
Dividing the second equation by the first yields $\frac{e^{-\mu}}{1-e^{-\mu}} = a$, so the solutions are $\mu = \ln(1+\frac{1}{a})$ and $e^{-\lambda-1} = 1-e^{-\mu} = \frac{1}{1+a}$, hence finally
$$ p_k = \frac{1}{1+a} \exp\left(-k\ln\left(1+\frac{1}{a}\right)\right) = \frac{1}{1+a} \left(\frac{a}{1+a}\right)^k \quad \forall k = 0,1,2,\ldots $$
i.e. the entropy-maximising distribution on $\{0,1,2,\ldots\}$ with mean $a$ is the geometric distribution.
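As a quick sanity check, one can verify the derived pmf numerically: it should sum to 1, have mean $a$, and attain the closed-form entropy $(1+a)\ln(1+a) - a\ln a$ (which follows by substituting $p_k$ into $-\sum_k p_k \ln p_k$). The sketch below is illustrative only; the choice $a = 2.5$ and the truncation point `N` are arbitrary assumptions, with the tail beyond `N` being negligibly small.

```python
import math

def max_entropy_pmf(a, k):
    """Entropy-maximising pmf on {0, 1, 2, ...} with mean a:
    p_k = (1/(1+a)) * (a/(1+a))**k, a geometric distribution."""
    return (1.0 / (1 + a)) * (a / (1 + a)) ** k

a = 2.5   # any a > 0 works; chosen arbitrarily for the check
N = 2000  # truncation point; the geometric tail beyond N is negligible

p = [max_entropy_pmf(a, k) for k in range(N)]

total = sum(p)                                  # should be ~ 1 (normalisation)
mean = sum(k * pk for k, pk in enumerate(p))    # should be ~ a (mean constraint)
entropy = -sum(pk * math.log(pk) for pk in p)   # should match the closed form

print(total, mean, entropy)
print((1 + a) * math.log(1 + a) - a * math.log(a))  # closed-form entropy
```

The last two printed values agreeing confirms the algebra above: the geometric pmf satisfies both constraints and its entropy matches $(1+a)\ln(1+a) - a\ln a$.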