I have a problem where I want to prove that, for almost all distributions, applying a transform which "favours" large probabilities results in a lower-entropy distribution. It feels really obvious, but I'm having a hard time formalizing a proof. I've tried using the concavity of the entropy, but I'm not quite sure how to take advantage of it.
More formally:
Let $a,p\in \mathbb{R}^n$ be such that $0 < a_1 < ... < a_j < ... < a_n$, $0 < p_1 < ... < p_j < ... < p_n < 1$, and $\sum_j p_j = 1$.
Let $H[p] = -\sum_j p_j\log(p_j)$
Now let $q_j = \frac{a_jp_j}{\sum_j a_jp_j}$.
Show that $H[q] < H[p]$ for all $p,a$ described above.
Example: $n=2$ $a_1=1/2,a_2=1$, $p_1=1/3,p_2=2/3$.(using $\log_2$)
$H[p] = \frac{1}{3}\log(3)+\frac{2}{3}(\log(3)-\log(2)) = \log(3)-\frac{2}{3}\log(2) \approx 0.92$
$H[q] = \frac{1}{5}\log(5)+\frac{4}{5}(\log(5)-\log(4)) = \log(5)-\frac{4}{5}\log(4) \approx 0.72 < H[p]$
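As a sanity check, the worked example can be reproduced numerically (a small Python script of my own, not part of the question):

```python
import math

# Worked example from above: n = 2, a = (1/2, 1), p = (1/3, 2/3), log base 2.
a = [0.5, 1.0]
p = [1 / 3, 2 / 3]

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in dist)

Z = sum(ai * pi for ai, pi in zip(a, p))   # normalizer, 5/6
q = [ai * pi / Z for ai, pi in zip(a, p)]  # q = (1/5, 4/5)

print([round(v, 3) for v in q])  # [0.2, 0.8]
print(round(entropy(p), 2))      # 0.92
print(round(entropy(q), 2))      # 0.72
```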
EDIT:
My particular problem is in fact the special case $a_j=p_j$, but I would like a general solution.
Consider the transformation $p_i\to\frac{p_ia_i^x}{\sum_ip_ia_i^x}$ for $x\in[0,1]$, which continuously transforms $p$ (at $x=0$) into $q$ (at $x=1$). If we can show that the derivative of the entropy with respect to $x$ is negative at $x=0$, it follows that it is negative for all $x$, since we can rescale to shift any value of $x$ to $0$: absorbing $a_i^{x_0}$ into $p_i$ and renormalizing yields a distribution of the same form, with the ordering hypotheses preserved.
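As a numerical sanity check (with arbitrarily chosen ascending $a$ and $p$ of my own, not from the question), the entropy is indeed strictly decreasing along this path:

```python
import math

def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(v * math.log(v) for v in dist)

def interpolate(p, a, x):
    """The path p_i -> p_i * a_i**x / Z; gives p at x = 0 and q at x = 1."""
    w = [pi * ai ** x for pi, ai in zip(p, a)]
    Z = sum(w)
    return [v / Z for v in w]

# Arbitrary inputs satisfying the hypotheses (a and p strictly ascending).
a = [0.2, 0.5, 1.3, 2.0]
p = [0.1, 0.2, 0.3, 0.4]

# Entropy sampled along x in [0, 1] is strictly decreasing.
hs = [entropy(interpolate(p, a, k / 20)) for k in range(21)]
assert all(h0 > h1 for h0, h1 in zip(hs, hs[1:]))
```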
With $\frac{\mathrm d}{\mathrm dx}(f(x)\log f(x))=(1+\log f(x))f'(x)$, we have
\begin{eqnarray} && \left.\frac{\mathrm d}{\mathrm dx}\left(-\sum_i\frac{p_ia_i^x}{\sum_kp_ka_k^x}\log\frac{p_ia_i^x}{\sum_kp_ka_k^x}\right)\right|_{x=0} \\ &=& \left.-\sum_i\left(1+\log\frac{p_ia_i^x}{\sum_kp_ka_k^x}\right)\frac{p_ia_i^x}{\sum_kp_ka_k^x}\left(\log a_i-\frac{\sum_kp_ka_k^x\log a_k}{\sum_kp_ka_k^x}\right)\right|_{x=0} \\ &=& -\sum_i\left(1+\log p_i\right)p_i\left(\log a_i-\sum_kp_k\log a_k\right) \\ &=& -\sum_ip_i\log p_i\left(\log a_i-\sum_kp_k\log a_k\right) \\ &=& -\left(E\left[\log p_i\log a_i\right]-E\left[\log p_i\right]E\left[\log a_i\right]\right) \\ &=&-\operatorname{Cov}(\log p_i,\log a_i)\;, \end{eqnarray}
where the constant $1$ drops out because $\sum_ip_i\left(\log a_i-\sum_kp_k\log a_k\right)=0$.
This covariance is positive since the $p_i$ and $a_i$ are both strictly ascending, so $\log p_i$ and $\log a_i$ are similarly ordered and Chebyshev's sum inequality (in its weighted form) applies. This in fact proves the more general result that the entropy is reduced as long as the logarithms of the $p_i$ and the $a_i$ are positively correlated.
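Both facts — the positive covariance for co-sorted sequences and $H[q] < H[p]$ — can be spot-checked on random instances (the sampling ranges below are my own arbitrary choices):

```python
import math
import random

random.seed(0)

def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(v * math.log(v) for v in dist)

def cov(xs, ys, w):
    """Covariance of sequences xs, ys under probability weights w."""
    ex = sum(wi * xi for wi, xi in zip(w, xs))
    ey = sum(wi * yi for wi, yi in zip(w, ys))
    exy = sum(wi * xi * yi for wi, xi, yi in zip(w, xs, ys))
    return exy - ex * ey

for _ in range(1000):
    n = random.randint(2, 6)
    p = sorted(random.uniform(0.01, 1) for _ in range(n))
    s = sum(p)
    p = [v / s for v in p]                                # ascending probabilities
    a = sorted(random.uniform(0.1, 3) for _ in range(n))  # ascending weights
    Z = sum(ai * pi for ai, pi in zip(a, p))
    q = [ai * pi / Z for ai, pi in zip(a, p)]
    # Cov(log p, log a) > 0 for co-sorted sequences, hence H[q] < H[p].
    assert cov([math.log(v) for v in p], [math.log(v) for v in a], p) > 0
    assert entropy(q) < entropy(p)
```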