In the usual Shannon source coding problem one chooses codeword lengths that minimize $E[L]:=\sum_i p_il_i$ over all $L=(l_1,l_2, \dots), l_i\ge 0$ such that $\sum_i e^{-l_i}\le 1$ (Kraft inequality), where $p_i$ are the probabilities of the letters.
The answer to this problem is $l_i=-\log p_i$ for all $i$.
Campbell generalized this to a problem where one minimizes $C[L]:=\frac{1}{t}\log\sum_i p_i e^{t l_i}$, $t>0$, instead of the expected length, subject to the same Kraft inequality. Note that the $l_i$'s which minimize $C[L]$ are the same as those which minimize just $\sum_i p_i e^{t l_i}$, since $\frac{1}{t}\log(\cdot)$ is increasing.
The solution to this problem is $l_i = -\log(p_i^{\alpha}/\sum_jp_j^{\alpha})$, where $\alpha = 1/(1+t)$. Although this solution is completely different from that of the usual source coding theorem, I can't heuristically see why they don't have the same solution. After all, the only difference is that in the second case, the lengths are exponentiated in the objective function. So I somehow feel that the $l_i$'s which minimize $\sum_i p_il_i$ should also minimize $\sum_i p_ie^{t l_i}$ (also since the set of feasible $L$'s is the same for both problems).
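A quick numerical sketch of the gap between the two solutions (the toy distribution and the choice $t=1$, hence $\alpha=1/2$, are my own assumptions for illustration): the Shannon lengths $-\log p_i$ and the Campbell lengths $-\log(p_i^\alpha/\sum_j p_j^\alpha)$ both saturate the Kraft inequality, yet Campbell's objective is strictly smaller at the Campbell lengths.

```python
import math

# Toy distribution and Campbell parameter (arbitrary choices for illustration).
p = [0.7, 0.2, 0.1]
t = 1.0
alpha = 1.0 / (1.0 + t)

# Shannon-optimal lengths: l_i = -log p_i.
shannon = [-math.log(pi) for pi in p]

# Campbell-optimal lengths: l_i = -log(p_i^alpha / sum_j p_j^alpha).
Z = sum(pi ** alpha for pi in p)
campbell = [-math.log(pi ** alpha / Z) for pi in p]

def C(lengths):
    """Campbell's objective (1/t) * log(sum_i p_i exp(t * l_i))."""
    return math.log(sum(pi * math.exp(t * li) for pi, li in zip(p, lengths))) / t

def kraft(lengths):
    """Left-hand side of the Kraft inequality sum_i exp(-l_i)."""
    return sum(math.exp(-li) for li in lengths)

# Both length vectors satisfy Kraft with equality, but they are not
# the same vector, and each one wins under its own objective.
print(kraft(shannon), kraft(campbell))  # both 1.0 (up to rounding)
print(C(shannon), C(campbell))          # C is smaller at the Campbell lengths
```

Running this shows $C[\text{Campbell}] \approx 0.94 < C[\text{Shannon}] \approx 1.10$, so the Shannon-optimal lengths really do fail to minimize Campbell's objective.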
I know that my heuristic is wrong. I know that the solutions to both problems are different. Can someone enlighten me?
Well, you "feel" that, given $x_i^*$ that minimizes $\sum p_i x_i$ subject to some restriction (in our case, $x_i\ge 0$ and $\sum e^{-x_i}\le 1$ - but that's not essential), the same $x_i^*$ minimizes $\sum p_i g(x_i)$, where $g(\cdot)$ is any increasing function. That's obviously false.
An example: take $p_i=1$ for all $i$ (with $n$ coordinates), and the restrictions $x_i\ge 0$, $\sum x_i^2 = 1$. Then the objective $\sum x_i$ is minimized by $x=(1,0,\dots,0)$, while the alternative objective $\sum x_i^3$ is minimized by the uniform $x_i=1/\sqrt{n}$.
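A numerical check of this counterexample (taking $n=4$ coordinates as a concrete choice): both points satisfy the constraint $\sum x_i^2 = 1$, yet the linear and cubic objectives are minimized at opposite extremes.

```python
# Counterexample check with n = 4 coordinates; both points lie on the
# constraint surface sum x_i^2 = 1.
n = 4
concentrated = [1.0] + [0.0] * (n - 1)   # x = (1, 0, ..., 0)
uniform = [n ** -0.5] * n                # each x_i = 1/sqrt(n) = 0.5 here

def linear(x):
    """Original objective: sum x_i."""
    return sum(x)

def cubic(x):
    """Alternative objective: sum x_i^3 (an increasing transform per coordinate)."""
    return sum(v ** 3 for v in x)

print(linear(concentrated), linear(uniform))  # 1.0 vs 2.0: concentrated wins
print(cubic(concentrated), cubic(uniform))    # 1.0 vs 0.5: uniform wins
```

So the minimizer of $\sum x_i$ (the concentrated point) and the minimizer of $\sum x_i^3$ (the uniform point) are different, even though cubing is increasing on $[0,\infty)$.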