I think my homework problem about entropy is formulated incorrectly.
The question is the following:
Let $X$ be a discrete random variable. Show that the entropy of a function $g$ of $X$ is less than or equal to the entropy of $X$ by justifying the following steps:
\begin{align} H(X) & = H(X)+H(g(X)\mid X)\\[8pt] & = H(X,g(X))\\[8pt] & = H(g(X)) + H(X\mid g(X))\\[8pt] & \geq H(g(X)) \end{align}
All the steps seem obvious except the first one. Doesn't it entail that $H(g(X)\mid X) = 0$, since entropy is always non-negative? How can that be a generally valid statement?
All I can think is the following:
$$\begin{align} H(g(X)\mid X) & = \sum_x P(X=x)\,H(g(X)\mid X=x)\\ & = - \sum_{x_1} p(x_1) \sum_{g(x_2)} p(g(x_2)\mid x_1)\log p(g(x_2)\mid x_1) \end{align}$$
How is this quantity $0$?
It is because $\mathsf P(g(X)=g(x_2) \mid X=x_1)$ equals either $1$ or $0$, and since we use the convention that $0 \log 0 = 0$, it follows that $H(g(X) \mid X) = 0$.
$$\begin{align} \because p(g(x_2)\mid x_1) & = \begin{cases} 1 & : g(x_2)=g(x_1) \\ 0 & : g(x_2)\neq g(x_1) \end{cases} \\[2ex] p(g(x_2)\mid x_1) \log p(g(x_2)\mid x_1) & = 0 \\[4ex] \therefore H(g(X)\mid X) & = \sum_{x_1} p(x_1) H(g(X)\mid X=x_1) \\[1ex] & = - \sum_{x_1} p(x_1)\sum_{x_2}p(g(x_2)\mid x_1)\log p(g(x_2)\mid x_1) \\[1ex] & = 0 \end{align}$$
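The whole chain can be checked numerically. Below is a sketch in Python; the distribution `p` and the map `g` are made up for illustration (`g` is deliberately many-to-one, so the inequality is strict):

```python
import math

# Hypothetical example: X takes values 0..3 with these probabilities,
# and g collapses values by parity, so g is many-to-one.
p = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
g = lambda x: x % 2

def entropy(dist):
    """Shannon entropy in bits, using the convention 0*log(0) = 0."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)

# Distribution of g(X): pool the mass of all x mapping to the same value.
pg = {}
for x, q in p.items():
    pg[g(x)] = pg.get(g(x), 0.0) + q

# Joint distribution of (X, g(X)): all mass sits on the pairs (x, g(x)).
pj = {(x, g(x)): q for x, q in p.items()}

H_X, H_gX, H_joint = entropy(p), entropy(pg), entropy(pj)

# H(g(X) | X) = H(X, g(X)) - H(X); it vanishes because g(X) is
# determined by X, so the joint carries no more entropy than X alone.
H_g_given_X = H_joint - H_X
```

Here `H_g_given_X` comes out to $0$, `H_joint` equals `H_X` (the first two steps of the derivation), and `H_gX` is strictly smaller than `H_X` because `g` merges distinct values of $X$.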
Remark: As MHS points out, this result should match our intuition. If entropy measures disorder, then $g(X)$ conditioned on $X$ should have none, since $g(X)$ takes exactly one value for any given value of $X$. Knowledge of a random variable entirely determines knowledge of any function of that random variable.
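To make that last point concrete: conditioned on $X = x$, the distribution of $g(X)$ is degenerate, i.e. all of its mass sits on the single point $g(x)$, so each term $H(g(X)\mid X=x)$ is exactly $0$. A minimal sketch (the support `xs` and the map `g` are made up for illustration):

```python
import math

# Made-up support and map, purely for illustration.
xs = [0, 1, 2, 3]
g = lambda x: x % 2
values = sorted({g(v) for v in xs})  # range of g

for x in xs:
    # Conditioned on X = x, g(X) puts all its mass on the point g(x):
    # P(g(X) = y | X = x) is 1 if y == g(x) and 0 otherwise.
    cond = {y: (1.0 if y == g(x) else 0.0) for y in values}
    # With the convention 0*log(0) = 0, each conditional entropy vanishes.
    h = -sum(q * math.log2(q) for q in cond.values() if q > 0)
    assert h == 0.0
```

Averaging these zero terms over $p(x)$ gives $H(g(X)\mid X) = 0$, which is exactly what the first step of the derivation uses.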