
Hi! I recently found a proof that the correlation coefficient cannot exceed $1$ in absolute value.
I have two questions:
Why must the discriminant be non-positive? It seems to come from treating $a$ as the unknown in a quadratic equation and looking at its discriminant, but I don't see why the discriminant has to be less than or equal to $0$.
Why do they use $\{a(x-\mu_x)+(y-\mu_y)\}^2$ to prove in general that the coefficient of correlation is $\leq 1$?
Source: A. Papoulis, "Probability, Random Variables, and Stochastic Processes", Third Edition, Chap. 7, pp. 152-153
In the spirit of simplicity, here is a rewriting of the same proof:
Let $X$ and $Y$ be two random variables with finite variances, and let $a$ be any real number (not just a positive one; the argument needs the inequality for every real $a$). Now, $aX + Y$ is a random variable, so its variance satisfies $V(aX + Y) \ge 0$ (variance is always non-negative). Using the properties of variance:
$V(aX + Y) \ge 0\Rightarrow\\ a^2V(X) + 2a\text{Cov}(X, Y) + V(Y) \ge 0$
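As a worked step, this is where the expression $\{a(x-\mu_x)+(y-\mu_y)\}^2$ from the question comes from. Writing $\mu_x = E[X]$ and $\mu_y = E[Y]$ (notation assumed to match the book's), the mean of $aX + Y$ is $a\mu_x + \mu_y$, so the variance is the expectation of exactly that square:

$$
\begin{aligned}
V(aX + Y) &= E\left[\{a(X-\mu_x)+(Y-\mu_y)\}^2\right] \\
&= a^2E\left[(X-\mu_x)^2\right] + 2aE\left[(X-\mu_x)(Y-\mu_y)\right] + E\left[(Y-\mu_y)^2\right] \\
&= a^2V(X) + 2a\text{Cov}(X, Y) + V(Y)
\end{aligned}
$$

The square is chosen because its expectation can never be negative, and expanding it produces $V(X)$, $V(Y)$ and $\text{Cov}(X, Y)$ together in a single inequality.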
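Now, the left-hand side of the inequality is a quadratic function of $a$ that is non-negative for every real $a$, so its graph is a parabola that never dips below the horizontal axis (assuming $V(X) > 0$, so that it is a genuine quadratic). It therefore has either two complex roots or a single repeated real root; if it had two distinct real roots, it would be negative between them. Two distinct real roots occur exactly when the discriminant is positive, so the discriminant must be non-positive: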
$[2\text{Cov}(X, Y)]^2 - 4V(X)V(Y) \le 0 \Rightarrow\\ [\text{Cov}(X, Y)]^2 \le V(X)V(Y) \Rightarrow\\ \left[\dfrac{\text{Cov}(X, Y)}{\sqrt{V(X)V(Y)}}\right]^2 \le 1 \Rightarrow\\ \rho^2 \le 1 \Rightarrow\\ -1 \le \rho \le 1$
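If it helps to see the argument numerically, here is a minimal Python sketch that checks each step on simulated data. It is not from Papoulis: the sample (1000 Gaussian points with $Y = 0.6X + \text{noise}$), the seed, and all function names are assumptions made purely for illustration, and population-style (divide by $n$) formulas are used so that the expansion of $V(aX + Y)$ holds exactly for the sample.

```python
import random


def variance(xs):
    """Population variance (divide by n) of a sample."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def covariance(xs, ys):
    """Population covariance (divide by n) of two equal-length samples."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def variance_of_combination(xs, ys, a):
    """V(aX + Y), computed directly from the combined sample."""
    return variance([a * x + y for x, y in zip(xs, ys)])


def discriminant(xs, ys):
    """Discriminant of a^2 V(X) + 2a Cov(X, Y) + V(Y), viewed as a quadratic in a."""
    return (2 * covariance(xs, ys)) ** 2 - 4 * variance(xs) * variance(ys)


def correlation(xs, ys):
    """Correlation coefficient: Cov(X, Y) / sqrt(V(X) V(Y))."""
    return covariance(xs, ys) / (variance(xs) * variance(ys)) ** 0.5


# Hypothetical data: Y depends linearly on X plus independent noise.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(1000)]
ys = [0.6 * x + random.gauss(0, 1) for x in xs]

# The quadratic in a, evaluated on a grid of positive AND negative values of a.
min_quadratic = min(variance_of_combination(xs, ys, k / 10) for k in range(-50, 51))

disc = discriminant(xs, ys)
rho = correlation(xs, ys)

print("smallest V(aX + Y) on the grid:", min_quadratic)
print("discriminant:", disc)
print("correlation:", rho)
```

The smallest value of the quadratic on the grid stays non-negative, the discriminant comes out non-positive, and the correlation lands in $[-1, 1]$, mirroring the three steps of the proof.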