I am studying the Cauchy-Schwarz inequality. I understand why it works for the standard dot product on $\mathbb R^n$.
However, I find it hard to grasp intuitively why it works for the expectation of a product of random variables. I understand that this is just another example of an inner product, but I don't intuitively understand why it's similar to the dot product.
Specifically, what's confusing me is that we're not summing the products of two columns of real numbers as we do with the dot product, not even when the random variables are discrete and finite.
The reason is that the possible values of the random variables are first multiplied by their probabilities. But these probabilities are different when we take $E(xx)$ (the sum of $x_i^2 p_i$) than when we take $E(xy)$ (the sum of $x_i p_i \cdot y_j p_j$).
So I'm wondering if there is a way to show the connection between the expectation operator and the dot product that makes it intuitively clear why the Cauchy-Schwarz inequality holds for both of them:
$$E(|xy|)^2\leq E(xx)E(yy)$$
To build intuition, consider the case in which $X$ and $Y$ take finitely many values. In this case, there exists a finite sample space $\Omega$ and a probability measure $\mathbb{P}$ over $\Omega$ such that $X: \Omega \rightarrow \mathbb{R}$ and $Y: \Omega \rightarrow \mathbb{R}$. Since $\Omega$ is finite, a random variable $Z: \Omega \rightarrow \mathbb{R}$ can be identified with the finite-dimensional real vector $(Z(\omega))_{\omega \in \Omega}$. Let $\langle Z_1, Z_2 \rangle$ denote the usual dot product, and write $Z\sqrt{\mathbb{P}}$ for the vector with coordinates $Z(\omega)\sqrt{\mathbb{P}(\omega)}$. We obtain
\begin{align*}
E[|XY|]^2 &= \left(\sum_{\omega \in \Omega} {|X(\omega)Y(\omega)|\mathbb{P}(\omega)}\right)^2 \\
&= \left(\sum_{\omega \in \Omega} {\bigg|\left(X(\omega)\sqrt{\mathbb{P}(\omega)}\right) \left(Y(\omega)\sqrt{\mathbb{P}(\omega)}\right)\bigg|}\right)^2 \\
&= \langle|X\sqrt{\mathbb{P}}|,|Y\sqrt{\mathbb{P}}|\rangle^2 \\
&\leq \langle X\sqrt{\mathbb{P}},X\sqrt{\mathbb{P}}\rangle \langle Y\sqrt{\mathbb{P}},Y\sqrt{\mathbb{P}}\rangle \\
&= E[X^2]E[Y^2]
\end{align*}
The inequality is the ordinary Cauchy-Schwarz inequality for the dot product, using the fact that a vector and its coordinatewise absolute value have the same norm.

Note that you were having trouble because you used the law of the unconscious statistician to compute each expectation from the separate probability mass functions of $X$ and $Y$. This problem is avoided by computing every expectation directly over $\mathbb{P}$: the weights $\mathbb{P}(\omega)$ are then the same in $E[X^2]$, $E[Y^2]$ and $E[|XY|]$.
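As a quick numerical check of the finite case, here is a minimal Python sketch. The four-outcome sample space and the values in `prob`, `x` and `y` are made up for illustration, and the helper names `dot` and `expectation` are hypothetical. It rescales each coordinate by $\sqrt{\mathbb{P}(\omega)}$ and compares plain dot products with the expectations.

```python
import math

# Hypothetical finite sample space with four outcomes (illustrative values only).
prob = [0.1, 0.2, 0.3, 0.4]   # P(omega) for each outcome, sums to 1
x = [1.0, -2.0, 0.5, 3.0]     # X(omega)
y = [2.0, 1.0, -1.0, 0.5]     # Y(omega)

def dot(u, v):
    """Usual dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def expectation(values, prob):
    """Expectation computed directly over the sample space."""
    return sum(val * p for val, p in zip(values, prob))

# Rescale each coordinate by sqrt(P(omega)); absolute values give E|XY|.
u = [abs(a) * math.sqrt(p) for a, p in zip(x, prob)]
v = [abs(b) * math.sqrt(p) for b, p in zip(y, prob)]

e_abs_xy = expectation([abs(a * b) for a, b in zip(x, y)], prob)
e_xx = expectation([a * a for a in x], prob)
e_yy = expectation([b * b for b in y], prob)

# Each expectation equals a plain dot product of the rescaled vectors.
print(e_abs_xy, dot(u, v))
print(e_xx, dot(u, u))
print(e_yy, dot(v, v))

# Cauchy-Schwarz for the dot product is then the inequality for expectations.
print(e_abs_xy ** 2 <= e_xx * e_yy)
```

The point of the sketch is that the same weights `prob` enter all three expectations, so a single rescaling turns them into ordinary dot products.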
For continuous random variables with a joint density $f$, you can use the same idea, with the dot product replaced by $\langle g, h \rangle = \int_{\mathbb{R}^2} g(x,y)h(x,y)\,dx\,dy$. Here $|X|\sqrt{f(X,Y)}$ is shorthand for the function $(x,y) \mapsto |x|\sqrt{f(x,y)}$, and similarly for $Y$. That is,
\begin{align*}
E[|XY|]^2 &= \left(\int_{\mathbb{R}^2}{|xy|f(x,y)\,dx\,dy}\right)^2 \\
&= \left(\int_{\mathbb{R}^2}{\bigg|x\sqrt{f(x,y)}\bigg| \bigg|y\sqrt{f(x,y)}\bigg|\,dx\,dy}\right)^2 \\
&= \left\langle |X|\sqrt{f(X,Y)}, |Y|\sqrt{f(X,Y)} \right\rangle^2 \\
&\leq \left\langle X\sqrt{f(X,Y)}, X\sqrt{f(X,Y)} \right\rangle \left\langle Y\sqrt{f(X,Y)}, Y\sqrt{f(X,Y)} \right\rangle \\
&= E[X^2]E[Y^2]
\end{align*}
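The continuous case can be checked numerically in the same spirit. The sketch below assumes, purely for illustration, the joint density $f(x,y) = x + y$ on the unit square (zero elsewhere) and approximates the integrals with a midpoint rule. The function name `density` and the grid size `n` are hypothetical choices, not part of the argument above.

```python
import math

def density(x, y):
    """Assumed joint density f(x, y) = x + y on [0, 1]^2 (illustrative choice)."""
    return x + y

n = 200                       # grid points per axis (arbitrary)
h = 1.0 / n                   # cell width
mid = [(i + 0.5) * h for i in range(n)]   # midpoints of the grid cells

ip_uv = 0.0   # approximates <|X| sqrt(f), |Y| sqrt(f)> = E|XY|
ip_uu = 0.0   # approximates <X sqrt(f), X sqrt(f)>     = E[X^2]
ip_vv = 0.0   # approximates <Y sqrt(f), Y sqrt(f)>     = E[Y^2]

for x in mid:
    for y in mid:
        w = math.sqrt(density(x, y))
        u = abs(x) * w        # value of |x| sqrt(f(x, y)) at this cell
        v = abs(y) * w        # value of |y| sqrt(f(x, y)) at this cell
        ip_uv += u * v * h * h
        ip_uu += u * u * h * h
        ip_vv += v * v * h * h

print(ip_uv, ip_uu, ip_vv)
print(ip_uv ** 2 <= ip_uu * ip_vv)
```

For this density the exact values are $E[|XY|] = 1/3$ and $E[X^2] = E[Y^2] = 5/12$, so the approximations should land close to those. The discretized sums are themselves dot products with nonnegative weights, so the inequality holds for them exactly, not just approximately.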