In my information theory class I need to prove that entropy is concave (which is usually done with Jensen's inequality), but I want to use only the definition of entropy. However, my derivation leads to a wrong result.
Here is what I do:
I need to prove that:
$$\lambda {\rm H} \left(p\right)+\left(1-\lambda \right){\rm H} \left(q\right)\le {\rm H} \left(\lambda p+\left(1-\lambda \right)q\right)$$
I use the definition of entropy (with the summation carried out over $p$, over $q$, or over both $p$ and $q$):
$$\lambda {\rm H} \left(p\right)={\rm {\mathbb E}}_{p} \log \frac{1}{p^{\lambda } } ={\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda }}$$
$$\left(1-\lambda \right){\rm H} \left(q\right)={\rm {\mathbb E}}_{q} \log \frac{1}{q^{\left(1-\lambda \right)} } ={\rm {\mathbb E}}_{p,q} \log \frac{1}{q^{\left(1-\lambda \right)} } $$
$${\rm H} \left(\lambda p+\left(1-\lambda \right)q\right)={\rm {\mathbb E}}_{p,q} \log \frac{1}{\lambda p+\left(1-\lambda \right)q} $$
Then collecting everything and moving to the left-hand side:
$${\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } } +{\rm {\mathbb E}}_{p,q} \log \frac{1}{q^{\left(1-\lambda \right)} } ={\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } q^{\left(1-\lambda \right)} } $$
$${\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } q^{\left(1-\lambda \right)} } -{\rm {\mathbb E}}_{p,q} \log \frac{1}{\lambda p+\left(1-\lambda \right)q} \le 0$$
Using log properties:
$${\rm {\mathbb E}}_{p,q} \log \frac{\lambda p+\left(1-\lambda \right)q}{p^{\lambda } q^{\left(1-\lambda \right)} } \le 0$$
So I have to prove that the ratio under the log is less than unity (for the log to be negative):
$$\frac{\lambda p+\left(1-\lambda \right)q}{p^{\lambda } q^{\left(1-\lambda \right)} } \le 1$$
Finally I get an inequality which is definitely wrong (since AM-GM tells me the opposite):
$$\lambda p+\left(1-\lambda \right)q\le p^{\lambda } q^{\left(1-\lambda \right)} $$
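A quick numeric sanity check I ran (sampling random $p$, $q$, $\lambda$; the sample count is arbitrary) confirms that the weighted AM-GM inequality indeed points the other way:

```python
import random

# Sample random p, q, lam and confirm that weighted AM-GM gives the
# *opposite* direction:  lam*p + (1-lam)*q >= p**lam * q**(1-lam).
random.seed(0)
for _ in range(10000):
    p, q, lam = random.random(), random.random(), random.random()
    am = lam * p + (1 - lam) * q      # weighted arithmetic mean
    gm = p ** lam * q ** (1 - lam)    # weighted geometric mean
    assert am >= gm - 1e-12, (p, q, lam)
print("weighted AM >= weighted GM held on all samples")
```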
So, where am I wrong?
This only answers what went wrong (it does not use your chain of ideas to actually prove concavity of entropy).
You have dropped the expectation probabilities. Correctly, ${\rm {\mathbb E}}_{p} F(p) = \sum_{x_i} p(x_i) F(p(x_i))$.
1) As a first example, let $\lambda = \frac12$, $p=1$, $q=0^+$. Then, applying your ideas (without expectation probabilities) to the very first line, you get
$$ \lambda {\rm H} \left(p\right)+\left(1-\lambda \right){\rm H} \left(q\right)\le {\rm H} \left(\lambda p+\left(1-\lambda \right)q\right) \\ \leftrightarrow\\ \frac12 \log \frac{1}{1} + \frac12 \log \frac{1}{0^+}\le \log \frac{1}{\frac12 \cdot 1 + \frac12 \cdot 0 } = \log 2 $$ which is of course wrong. The correct treatment, including the expectation probabilities, is $$ \frac12 \cdot 1 \cdot \log \frac{1}{1} + \frac12\cdot 0^+ \cdot \log \frac{1}{0^+}\le \frac12\log 2 $$ and this is correct, since $0^+ \cdot \log \frac{1}{0^+} \to 0^+ $.
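These numbers can be checked mechanically; here is a small sketch of mine, with $0^+$ replaced by the stand-in value $10^{-12}$:

```python
import math

# Redo example 1 numerically: lam = 1/2, p = 1, q -> 0+ (here q = 1e-12).
# Without the expectation weights the left side blows up past log 2;
# with the weights p*log(1/p) and q*log(1/q) it stays tiny and finite.
lam, p, q = 0.5, 1.0, 1e-12

unweighted = lam * math.log(1 / p) + (1 - lam) * math.log(1 / q)
weighted = lam * p * math.log(1 / p) + (1 - lam) * q * math.log(1 / q)
rhs = (lam * p + (1 - lam) * q) * math.log(1 / (lam * p + (1 - lam) * q))

print(unweighted > math.log(2))   # True: unweighted version violates the bound
print(weighted <= rhs)            # True: weighted version satisfies it
```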
2) In a general setting which you want to pursue, putting the expectation probabilities in, you have (where for short, ${\rm {\mathbb E}} f(p) = \sum_{x_i} f(p(x_i))$ is used), $$ {\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } } +{\rm {\mathbb E}}_{p,q} \log \frac{1}{q^{\left(1-\lambda \right)} } = {\rm {\mathbb E}}\left[ p \log \frac{1}{p^{\lambda } } + q \log \frac{1}{q^{\left(1-\lambda \right)} }\right] $$ and, somewhat more involved, $$ {\rm {\mathbb E}}_{p,q} \log \frac{1}{\lambda p+\left(1-\lambda \right)q}={\rm {\mathbb E}}\left[({\lambda p+\left(1-\lambda \right)q})\log \frac{1}{\lambda p+\left(1-\lambda \right)q}\right] $$ which leaves you to show $$ p \log \frac{1}{p^{\lambda } } + q \log \frac{1}{q^{\left(1-\lambda \right)} }\le ({\lambda p+\left(1-\lambda \right)q})\log \frac{1}{\lambda p+\left(1-\lambda \right)q} $$ Continuing as you have laid it out would result in having to show $$ \frac{({\lambda p+\left(1-\lambda \right)q})^{({\lambda p+\left(1-\lambda \right)q})}}{p^{\lambda p} \cdot q^{\left(1-\lambda \right) q}} \le 1 $$ but I do not think this will lead us anywhere.
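A numeric sketch of mine (not part of the original argument) checks the per-symbol inequality above over random samples; note that the left-hand side equals $\lambda f(p) + (1-\lambda) f(q)$ for $f(x) = x\log(1/x)$:

```python
import math
import random

# Check  p*log(1/p**lam) + q*log(1/q**(1-lam)) <= m*log(1/m)
# with m = lam*p + (1-lam)*q, over random samples in (0, 1].
def f(x):
    # f(x) = x * log(1/x); the LHS above equals lam*f(p) + (1-lam)*f(q)
    return x * math.log(1 / x)

random.seed(1)
for _ in range(10000):
    p = random.uniform(1e-9, 1.0)
    q = random.uniform(1e-9, 1.0)
    lam = random.random()
    m = lam * p + (1 - lam) * q
    assert lam * f(p) + (1 - lam) * f(q) <= f(m) + 1e-12
print("per-symbol inequality held on all samples")
```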
3) As a third piece of explanation, which shows the effect of the above error versus the correct treatment, just notice that $\log (1/x)$ is convex whereas $x \log (1/x)$ is concave ... this says it all.
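To see the two curvatures concretely, here is a midpoint check (my addition; sample ranges are arbitrary):

```python
import math
import random

# Midpoint test: for a convex function the chord midpoint lies above the
# function value; for a concave function the relation flips.
g = lambda x: math.log(1 / x)          # convex on (0, inf)
f = lambda x: x * math.log(1 / x)      # concave on (0, inf)

random.seed(2)
for _ in range(10000):
    a = random.uniform(1e-6, 1.0)
    b = random.uniform(1e-6, 1.0)
    mid = (a + b) / 2
    assert (g(a) + g(b)) / 2 >= g(mid) - 1e-12   # convexity of log(1/x)
    assert (f(a) + f(b)) / 2 <= f(mid) + 1e-12   # concavity of x*log(1/x)
print("midpoint checks passed")
```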