I have a habit of understanding everything geometrically. In this case, the victim is probability theory.
I will assume that we are working in $L^2(\mathcal{F})$, the Hilbert space of square-integrable random variables, i.e. those $X$ with $E[X^2] < \infty$. Here we have a well-defined inner product given by:
$$\langle X, Y \rangle = E[XY]$$
Now, the inner product is directly related to angles. To be more specific, we have the following formula:
$$\cos\big(\mathrm{angle}(X,Y)\big) = \frac{\langle X, Y \rangle}{\|X\|\,\|Y\|}$$
Rescaling the vectors does not change the angle between them, so I will assume for now that I am working with unit vectors, so that:
$$\mathrm{angle}(X,Y) = \cos^{-1}\big(\langle X, Y \rangle\big)$$
Thus we can say that the inner product measures angles.
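To keep the geometry concrete, here is a quick numpy sketch of how I picture these quantities being estimated from samples (the particular distributions of $X$ and $Y$ below are just a toy choice of mine, only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two toy random variables (my own choice, only to make the picture concrete):
# Y is a noisy function of X, so the two are clearly dependent.
X = rng.standard_normal(n)
Y = 0.6 * X + 0.8 * rng.standard_normal(n)

inner = np.mean(X * Y)            # <X, Y> = E[XY], estimated by a sample mean
norm_X = np.sqrt(np.mean(X**2))   # |X| = sqrt(E[X^2])
norm_Y = np.sqrt(np.mean(Y**2))

cos_angle = inner / (norm_X * norm_Y)    # cos(angle(X, Y))
print(cos_angle, np.arccos(cos_angle))   # angle in radians
```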
All right, given all of this, I have some questions:
- I can interpret $E[X] = \langle X, 1 \rangle$ as (the cosine of) the angle between $X$ and the constant random variable $1$. But I have a hard time relating this geometric interpretation (using angles) to the statistical intuition we learned in school.
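For instance, here is a small self-contained numerical check of that identity, with a toy $X$ of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
X = 2.0 + rng.standard_normal(1_000_000)  # toy X with nonzero mean, just for illustration

# <X, 1> = E[X * 1] = E[X]; since |1| = sqrt(E[1^2]) = 1,
# cos(angle(X, 1)) = E[X] / |X|.
inner_X1 = np.mean(X * 1.0)                    # estimates E[X] (about 2 here)
cos_angle = inner_X1 / np.sqrt(np.mean(X**2))  # about 2 / sqrt(5)
print(inner_X1, cos_angle)
```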
The case of covariance is similar. We know that the covariance between $X$ and $Y$ tells us how the two variables are related: how one tends to change when the other does. However, mathematically,
$$\mathrm{Cov}(X,Y) = \mathrm{Cov}(X - \mu_X, Y) = E[(X - \mu_X)Y] = \langle \bar{X}, Y \rangle,$$ where $\bar{X} = X - \mu_X$, which (for unit vectors) is $\cos\big(\mathrm{angle}(\bar{X},Y)\big)$. How can I link $\mathrm{angle}(\bar{X},Y)$ to the classical interpretation of the covariance?
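Again, just to convince myself numerically that the chain above holds (with toy distributions of my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
X = rng.standard_normal(n)                  # toy pair, my own choice:
Y = 0.6 * X + 0.8 * rng.standard_normal(n)  # Y depends on X, so Cov(X, Y) != 0

Xbar = X - X.mean()                         # Xbar = X - mu_X
cov_XY = np.mean(Xbar * (Y - Y.mean()))     # classical covariance
inner = np.mean(Xbar * Y)                   # <Xbar, Y>; equal because E[Xbar] = 0
cos_angle = inner / np.sqrt(np.mean(Xbar**2) * np.mean(Y**2))  # cos(angle(Xbar, Y))
print(cov_XY, inner, cos_angle)
```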
Something more dramatic is the case of variance. We know that variance measures the dispersion of the possible outcomes of the random variable $X$. Assuming that $\mu_X = 0$, we have: $$\mathrm{Var}(X) = \mathrm{Cov}(X,X) = E[X^2] = \langle X, X \rangle = \|X\|^2$$ That is, the notion of dispersion is linked to the notion of norm.
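And a last small check of the variance-as-squared-norm identity (again with a toy $X$ of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(0)
X = 2.0 + rng.standard_normal(1_000_000)  # toy X, just for illustration

Xbar = X - X.mean()         # center X so that mu_X = 0, as assumed above
var_X = X.var()             # classical variance (dispersion), ddof = 0
norm_sq = np.mean(Xbar**2)  # <Xbar, Xbar> = |Xbar|^2
print(var_X, norm_sq)       # the two numbers coincide
```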
Can anyone shed some light on this?