The proof said to use the quadratic polynomial $$P(t) = \sum_{i=1}^n(a_it - b_i)^2,$$ for which we notice that $P(t) \ge 0$ for all $t$. The part I didn't understand is where we conclude that, since $P(t) \ge 0$ for all $t$, the discriminant must satisfy $$ D = \left(-2\sum_{i=1}^na_ib_i\right)^2 - 4\left(\sum_{i=1}^na_i^2\right)\left(\sum_{i=1}^nb_i^2\right) \le 0.$$ I tried to understand the reasoning behind $$P(t) \ge 0 \text{ for all } t \implies D \le 0,$$
but I can't seem to get it through my head. I hope someone can explain the logic behind this; I'm very interested in understanding it.
There are two possible approaches.
Algebraic approach: Consider a polynomial of degree 2, that is, $P(t)=At^2+Bt+C$ with $A\neq 0$. It is known (see Quadratic formula) that its roots are $$t_{\pm}=\frac{-B\pm\sqrt{D}}{2A},$$ where $D=B^2-4AC$ is the discriminant.
Since $P(t)\ge 0$ for all $t$, we must have $A>0$ (if $A<0$, then $P(t)\to-\infty$ as $t\to\pm\infty$), and $P$ cannot have two distinct real roots $t_-<t_+$: otherwise $P(t)=A(t-t_-)(t-t_+)<0$ for $t\in(t_-,t_+)$.
It follows that either there is a single double root, that is $t_-=t_+$, hence by the above equation $D=0$, or there are two non-real complex roots, that is $D<0$. In both cases $D\le 0$.
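To connect this with the polynomial in the question, it may help to expand it and read off the coefficients. This is a worked expansion, assuming not all $a_i$ are zero so that $A>0$ (if all $a_i$ vanish, the final inequality is trivial): $$P(t)=\sum_{i=1}^n(a_it-b_i)^2=\underbrace{\left(\sum_{i=1}^na_i^2\right)}_{A}t^2+\underbrace{\left(-2\sum_{i=1}^na_ib_i\right)}_{B}t+\underbrace{\sum_{i=1}^nb_i^2}_{C}.$$ With these values, $D=B^2-4AC\le 0$ reads $$4\left(\sum_{i=1}^na_ib_i\right)^2\le 4\left(\sum_{i=1}^na_i^2\right)\left(\sum_{i=1}^nb_i^2\right),$$ which, after dividing by $4$, is the Cauchy-Schwarz inequality.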
Geometric approach: consider the function $P(t)=At^2+Bt+C$, with $A>0$, whose graph is a parabola opening upward. We look for the minimum of $P$: $$P'(t)=2At+B=0\iff t=t_{\min}=-\frac{B}{2A}.$$
It follows that $$P(t_{\min})=P\left(-\frac{B}{2A}\right)=\frac{-B^2}{4A}+C=\frac{-B^2+4AC}{4A}=\frac{-D}{4A}.$$ Since $A>0$, $$P(t)\ge0 \text{ for all } t\iff P(t_{\min})\ge0\iff -D\ge0,$$ which proves that $P(t)\ge0$ for all $t$ is equivalent to the non-positivity of the discriminant.
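If it helps to see the statement with concrete numbers, here is a minimal Python sketch that checks it for the polynomial $P(t)=\sum_i(a_it-b_i)^2$ from the question. The helper names (`quadratic_coefficients`, `discriminant`, `minimum_value`) and the sample vectors are illustrative assumptions, not part of the proof, and the sketch assumes at least one $a_i$ is nonzero so that $A>0$.

```python
def quadratic_coefficients(a, b):
    """Return (A, B, C) such that sum((a_i*t - b_i)**2) == A*t**2 + B*t + C."""
    A = sum(x * x for x in a)
    B = -2 * sum(x * y for x, y in zip(a, b))
    C = sum(y * y for y in b)
    return A, B, C


def discriminant(a, b):
    """Discriminant D = B**2 - 4*A*C of the quadratic above."""
    A, B, C = quadratic_coefficients(a, b)
    return B * B - 4 * A * C


def minimum_value(a, b):
    """Evaluate P at its vertex t_min = -B/(2A); assumes A > 0."""
    A, B, _ = quadratic_coefficients(a, b)
    t_min = -B / (2 * A)
    return sum((x * t_min - y) ** 2 for x, y in zip(a, b))


# Hypothetical sample vectors, chosen only for illustration.
a, b = [1, 2], [3, 4]
A, B, C = quadratic_coefficients(a, b)
D = discriminant(a, b)

print(D)  # -16, so D <= 0 as the argument predicts
# The minimum of P agrees with -D/(4A) up to rounding, and it is non-negative.
print(abs(minimum_value(a, b) - (-D / (4 * A))) < 1e-12)

# When b is a multiple of a, P has a double root and D is exactly 0.
print(discriminant([1, 2], [2, 4]))  # 0
```

The second example shows the boundary case: $D=0$ happens exactly when $P$ touches the $t$-axis, that is, when $b_i=t\,a_i$ for a single value of $t$.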