Problem
While working through proofs in my statistical learning theory course, my professor used the following step: $$ \mathbb{E}_\sigma\Big[\Big\Vert\sum_{i=1}^n\sigma_i\mathbf{x}_i\Big\Vert_2\Big]\leq \sqrt{\mathbb{E}_\sigma\Big[\sum_{i=1}^n\Vert\sigma_i\mathbf{x}_i\Vert_2^2\Big]} $$ where the $\sigma_i$ are independent Rademacher random variables, i.e. $\Pr[\sigma_i=1]=\Pr[\sigma_i=-1]=\frac{1}{2}$.
This does not seem correct to me, since in the simplest nontrivial case, $n=2$, I have
$$ \begin{aligned} \text{LHS}&=\mathbb{E}_\sigma\big[\sqrt{\Vert \mathbf{x}_1\Vert_2^2+\Vert \mathbf{x}_2\Vert_2^2+2\sigma_1\sigma_2\mathbf{x}_1^T\mathbf{x}_2}\big]\\ \text{RHS}&=\sqrt{\Vert \mathbf{x}_1\Vert_2^2+\Vert \mathbf{x}_2\Vert_2^2} \end{aligned} $$ and I cannot see how to conclude $\text{LHS}\leq\text{RHS}$ from these expressions.
However, this step is pivotal in deriving a meaningful generalization bound, so I think I am missing something.
Could someone help me? Any input will be appreciated.
This is Jensen's inequality combined with the independence of the $\sigma_i$. By Jensen (concavity of $\sqrt{\cdot}$), $\mathbb{E}\sqrt{Y}\leq\sqrt{\mathbb{E}Y}$ with $Y=\Vert\sum_{i=1}^n\sigma_i\mathbf{x}_i\Vert_2^2$. Expanding the square gives $Y=\sum_{i,j}\sigma_i\sigma_j\,\mathbf{x}_i^T\mathbf{x}_j$, and since $\mathbb{E}[\sigma_i\sigma_j]=0$ for $i\neq j$ (assuming independence), the cross terms vanish in expectation, leaving $$\mathbb{E}Y=\sum_{i=1}^n\Vert\mathbf{x}_i\Vert_2^2=\mathbb{E}_\sigma\Big[\sum_{i=1}^n\Vert\sigma_i\mathbf{x}_i\Vert_2^2\Big].$$ In your $n=2$ example, the cross term $2\sigma_1\sigma_2\mathbf{x}_1^T\mathbf{x}_2$ averages to zero under the square root's argument, so Jensen yields exactly $\text{LHS}\leq\text{RHS}$. So your professor is right.
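As a quick numerical sanity check (not part of the original argument), here is a short NumPy sketch that computes both sides of the inequality exactly, by enumerating all $2^n$ Rademacher sign patterns for a few arbitrary vectors; the vector count `n`, dimension `d`, and seed are arbitrary choices for illustration:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n, d = 5, 3
x = rng.normal(size=(n, d))  # arbitrary vectors x_1, ..., x_n

# LHS: exact expectation of ||sum_i sigma_i x_i||_2 over all 2^n sign patterns.
lhs = np.mean([np.linalg.norm(np.array(s) @ x)
               for s in product([-1, 1], repeat=n)])

# RHS: sqrt of E[sum_i ||sigma_i x_i||^2] = sqrt(sum_i ||x_i||^2),
# since sigma_i^2 = 1 always.
rhs = np.sqrt(sum(np.linalg.norm(xi) ** 2 for xi in x))

print(lhs, rhs)  # lhs should never exceed rhs
```

Repeating this with other seeds or dimensions always gives `lhs <= rhs`, consistent with the Jensen argument above.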