If $g$ is a function of the random variable $X$, is it true that $H(X) = H(X) + H(g(X)\mid X)$?


I think my homework about entropy is formulated incorrectly.

The question is the following:

Let $X$ be a discrete random variable. Show that the entropy of a function $g$ of $X$ is less than or equal to the entropy of $X$ by justifying the following steps:

\begin{align} H(X) & = H(X)+H(g(X)\mid X)\\[8pt] & = H(X,g(X))\\[8pt] & = H(g(X)) + H(X\mid g(X))\\[8pt] & \geq H(g(X)) \end{align}

All the steps seem obvious except the first one. Doesn't it entail that $H(g(X)\mid X) = 0$, since entropy is always non-negative? How can that be a generally valid statement?

All I can think is the following:

$$ H(g(X)\mid X) = \sum_x P(X=x)H(g(X)\mid X=x)\\ = - \sum_{x_1} p(x_1) \sum_{g(x_2)} p(g(x_2)\mid x_1)\log p(g(x_2)\mid x_1) $$

How is this quantity $0$?

There are 2 answers below.

Best answer

It is because $\mathsf P(g(X)=g(x_2) \mid X=x_1)$ will equal either $1$ or $0$, and since we use the convention that $0 \log 0 = 0$, then it follows that $H(g(X) \mid X) = 0$.

$$\begin{align} \because p(g(x_2)\mid x_1) & = \begin{cases} 1 & : g(x_2)=g(x_1) \\ 0 & : g(x_2)\neq g(x_1) \end{cases} \\[2ex] p(g(x_2)\mid x_1) \log p(g(x_2)\mid x_1) & = 0 \\[4ex] \therefore H(g(X)\mid X) & = \sum_{x_1} p(x_1) H(g(X)\mid X=x_1) \\[1ex] & = - \sum_{x_1} p(x_1)\sum_{x_2}p(g(x_2)\mid x_1)\log p(g(x_2)\mid x_1) \\[1ex] & = 0 \end{align}$$
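This is easy to check numerically. Below is a quick sanity check in Python; the distribution and the function $g$ are purely illustrative assumptions, not part of the original problem:

```python
from math import log2

def entropy(dist):
    """Shannon entropy in bits, using the convention 0*log(0) = 0."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Illustrative example: X uniform on {0, 1, 2, 3}, g(x) = x mod 2.
p_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

def g(x):
    return x % 2

# H(g(X)|X) = sum_x p(x) * H(g(X)|X=x).  Given X = x, g(X) is the
# constant g(x), so each conditional distribution is degenerate
# (all mass on one point) and each inner entropy is 0.
h_g_given_x = sum(px * entropy({g(x): 1.0}) for x, px in p_x.items())
print(h_g_given_x)  # 0.0
```

Swapping in any other finite distribution or any other function $g$ gives the same result, since the conditional distribution of $g(X)$ given $X=x$ is always degenerate.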


Remark: As MHS points out, this result should match our intuition. If entropy measures disorder, then $g(X)$ conditioned on $X$ should have none, since $g(X)$ takes exactly one value for any given value of $X$.

Knowing a random variable completely determines any function of that random variable.
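The full chain of equalities in the question can also be verified numerically. Here is a sketch in Python, again for an assumed small distribution and a non-injective $g$ (so that the inequality $H(X\mid g(X)) \geq 0$ is strict):

```python
from math import log2
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits, with the convention 0*log(0) = 0."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Illustrative example: X uniform on {0, 1, 2, 3}, g(x) = x mod 2.
p_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

def g(x):
    return x % 2

# Marginal of g(X): p(y) = sum of p(x) over all x with g(x) = y.
p_y = defaultdict(float)
for x, px in p_x.items():
    p_y[g(x)] += px

h_x = entropy(p_x)    # H(X)
h_gx = entropy(p_y)   # H(g(X))

# H(X | g(X)) = sum_y p(y) * H(X | g(X) = y), where the conditional
# distribution restricts X to the preimage g^{-1}(y), renormalized.
h_x_given_gx = 0.0
for y, py in p_y.items():
    cond = {x: px / py for x, px in p_x.items() if g(x) == y}
    h_x_given_gx += py * entropy(cond)

# The chain from the question: H(X) = H(g(X)) + H(X|g(X)) >= H(g(X)).
assert abs(h_x - (h_gx + h_x_given_gx)) < 1e-12
assert h_x >= h_gx
print(h_x, h_gx, h_x_given_gx)  # 2.0 1.0 1.0
```

Here $H(X) = 2$ bits, $H(g(X)) = 1$ bit, and the "lost" bit is exactly $H(X\mid g(X)) = 1$: knowing the parity of $X$ still leaves two equally likely values.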

Another answer

Think of entropy as randomness or disorder. Now if $X$ is known, then $g(X)$ can be evaluated and thus is also known. Hence the conditional entropy of $g(X)$ given $X$ is zero, as $g(X)$ can be deduced from the already given knowledge of $X$.

Of course, if you want the technical proof, the other answer gives the corresponding formulas. However, it is often useful to first get a "feeling" for what entropy means in order to see which statements could be true; the formal proofs do not always reveal that intuition.