We know that
$$\mathbb E[X^2 + Y^2 + 2XY] = \mathbb E[(X+Y)^2] \quad (1)$$
Obviously the function $g(X,Y) = (X+Y)^2$ is not linear. However, I have seen that $(1)$ is equal to $$\mathbb E[X^2] +\mathbb E[Y^2] + 2\mathbb E[XY] \quad (2)$$
But we know that $\mathbb E[g(x,y)] = g(\mathbb E[x,y])$ only if $g$ is linear. For example if $g(X,Y) =X+Y$, then $\mathbb E[X+Y] = \mathbb E[X] +\mathbb E[Y]$ because $X+Y$ is linear. So why does $(2)$ hold?
$E[g(x,y)]\neq g(E[x,y])$ as $g(E[x,y])=E^2[x]+E^2[Y]+2E[x]E[y]\neq E[x^2]+E[y^2]+2E[xy]$, due to the fact that in general $E^2[x]\neq E[x^2]$.
The reason that 2 hold is that the mean operator is linear in the sense that $E[\sum_ia_if_i(X_i)=\sum_ia_iE[f_i(X_i)]$, where $f_i$ refers to some functions of random variables.