Which would be the following: $$H\left (p\right ) = \sum_{i=1}^{I} p_{i} \log \frac{1}{p_{i}}=H\left (p_{1}, 1-p_{1}\right )+\left (1-p_{1}\right ) H\left (\frac{p_{2}}{1-p_{1}}, \frac{p_{3}}{1-p_{1}}, \ldots, \frac{p_{I}}{1-p_{1}}\right )$$
With $H$ as the entropy and $p$ as a probability vector.
- The definition of entropy as used here: $\mathrm {H} (X)=-\sum _{i=1}^{n}{\mathrm {P} (x_{i})\log \mathrm {P} (x_{i})}$.
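As a quick numerical sanity check (separate from the proof itself), the grouping identity can be verified for a concrete probability vector; the vector below is arbitrary and only illustrative:

```python
import math

def entropy(p):
    # Shannon entropy in nats; terms with p_i = 0 contribute 0
    return -sum(x * math.log(x) for x in p if x > 0)

p = [0.5, 0.25, 0.125, 0.125]  # any probability vector works
p1 = p[0]
lhs = entropy(p)
rest = [x / (1 - p1) for x in p[1:]]          # conditional distribution given "not outcome 1"
rhs = entropy([p1, 1 - p1]) + (1 - p1) * entropy(rest)
print(abs(lhs - rhs) < 1e-12)  # True
```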
I do have access to a correct proof, but I would rather check whether my own proof below is correct, and if not, find out where it goes wrong.
First, we lay down a few equations which we will be using to complete our proof:
$$\begin{aligned} &\left (1-p_{1}\right ) H\left (\frac{p_{2}}{1-p_{1}}, \ldots, \frac{p_{I}}{1-p_{1}}\right ) \\&= \left (1-p_{1}\right ) \left (H\left (\frac{p_{2}}{1-p_{1}}\right ) + \ldots + H\left (\frac{p_{I}}{1-p_{1}}\right )\right )\\ &= \left (1-p_{1}\right ) \left (\frac{1}{1-p_1} \left (H\left (p_2\right )+\ldots+H\left (p_I\right )\right ) + H\left (\frac{1}{1-p_1}\right ) \left (p_2+\ldots+p_I\right )\right ) \end{aligned}$$
We know that $p_2+\ldots+p_I = 1-p_1$, so continuing the above:
$$\begin{aligned} &= H\left (p_2\right )+\ldots+H\left (p_I\right ) + \left (1-p_{1}\right )\log{\left (1-p_1\right )} \end{aligned}$$
We also have: $$\begin{aligned} H\left (1-p_1\right ) + \left (1-p_{1}\right )\log{\left (1-p_1\right )} &= \left (1-p_1\right )\left (\log\frac{1}{1-p_1}+\log\left (1-p_1\right )\right ) \\ &= 0 \end{aligned}$$
Putting the above together:
$$\begin{aligned} H\left (p_{1}, 1-p_{1}\right )+\left (1-p_{1}\right ) H\left (\frac{p_{2}}{1-p_{1}}, \frac{p_{3}}{1-p_{1}}, \ldots, \frac{p_{I}}{1-p_{1}}\right ) &= H\left (p_1\right )+\ldots+H\left (p_I\right ) \\ &= H\left (p\right ) \end{aligned}$$
Your proof is basically right, but you should not use $H(\cdot)$ for two different things; I'd instead define $g(x)= x \log\frac{1}{x}$ and so on.
An alternative (more general and perhaps more elegant) proof: consider two random variables $X_1$, $X_2$, taking values in $\{1,2,\ldots,m\}$ and $\{m+1,m+2,\ldots,n\}$ with given pmfs $p_1(i)$ and $p_2(j)$. We form a third rv $Z$ by setting $Z=X_1$ with probability $a$, and $Z=X_2$ otherwise. This is known as a (non-overlapping) mixture.
Then $$H(Z)=h(a) + a H(X_1) +(1-a) H(X_2)$$
where $h(a)= -a \log a - (1-a) \log (1-a)$.
Proof: define the indicator variable $E=1$ if $Z=X_1$, and $E=2$ otherwise.
Notice that $H(E|Z)=0$ (knowing $Z$, we know whether it came from $X_1$ or $X_2$).
Then $$H(E,Z)=H(Z)+H(E|Z) = H(E)+H(Z|E) \implies H(Z)=H(E)+H(Z|E)$$ But $H(E)=h(a)$ (Bernoulli variable), and $$H(Z|E) = P(E=1) H(Z|E=1) +P(E=2) H(Z|E=2)= a H(X_1) +(1-a) H(X_2)$$
Your equation is a special case, with $m=1$ and $a=p_1$.
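The mixture formula can also be checked numerically; the weight $a$ and the two pmfs below are illustrative, with $X_1$ and $X_2$ on disjoint supports as required:

```python
import math

def entropy(p):
    # Shannon entropy in nats; terms with p_i = 0 contribute 0
    return -sum(x * math.log(x) for x in p if x > 0)

a = 0.3                   # mixing weight (illustrative)
p1 = [0.2, 0.8]           # pmf of X1 on {1, 2}
p2 = [0.5, 0.25, 0.25]    # pmf of X2 on {3, 4, 5}, disjoint from X1's support
pz = [a * x for x in p1] + [(1 - a) * x for x in p2]  # pmf of Z
h_a = -a * math.log(a) - (1 - a) * math.log(1 - a)    # binary entropy h(a)
rhs = h_a + a * entropy(p1) + (1 - a) * entropy(p2)
print(abs(entropy(pz) - rhs) < 1e-12)  # True
```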