I have recently studied interpolation techniques in my college Numerical Methods class and I have this question: if we have a function $f(x)$ and we are asked to use Least Squares Interpolation (LSI) to derive an approximating polynomial of degree $n$, will that polynomial generally intersect the function at $n+1$ points, as it would if we used simple interpolation (via a Vandermonde system, for instance)? I experimented a little with this and the experiments say "yes", but I don't know how to prove it analytically.
I tried to carry out the proof for a 1st degree LSI polynomial:
let the function we need to approximate be $f(x)$ and the interpolating polynomial $g(x) = ax + b$.
I then routinely minimized the integral of the squared error with respect to $a$ and $b$ to get the system below ($p$ and $q$ are the limits of the interval over which I integrated the squared error):
$$\begin{bmatrix} \int_{p}^{q} x^2dx & \int_{p}^{q} x dx \\\int_{p}^{q} x dx&\int_{p}^{q} dx \end{bmatrix} \begin{bmatrix} a\\b \end{bmatrix} = \begin{bmatrix} \int_{p}^{q} xf(x)dx\\\int_{p}^{q} f(x)dx \end{bmatrix}$$ I then solved this system for $a$ and $b$, and consequently got $g(x)$.
I then set $g(x) = f(x)$; or in full form (solving the system by Cramer's rule), $$\scriptsize \frac{1}{\int_{p}^{q} dx \int_{p}^{q} x^2\,dx - \left(\int_{p}^{q} x\,dx\right)^2}\left[\left(\int_{p}^{q} xf\,dx \int_{p}^{q} dx - \int_{p}^{q} x\,dx \int_{p}^{q} f\,dx\right)x + \int_{p}^{q} x^2\,dx \int_{p}^{q} f\,dx - \int_{p}^{q} x\,dx \int_{p}^{q} xf\,dx\right] = f(x)$$
Now, is it possible to prove that this equation will have 2 real solutions no matter what function we plug in there?
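For reference, the kind of experiment I mentioned can be sketched in Python. This is just an illustrative example for the degree-1 case with $f(x) = \sin x$ on $[p, q] = [0, \pi]$ (my own choice of test function); it assembles the $2\times 2$ normal-equation system above and counts how often the error changes sign:

```python
import numpy as np

# Degree-1 experiment: fit g(x) = a*x + b to f(x) = sin(x) on [p, q] = [0, pi]
# by continuous least squares, then count sign changes of e(x) = f(x) - g(x).
p, q = 0.0, np.pi
xs = np.linspace(p, q, 20001)

def integral(y):
    # composite trapezoid rule on the grid xs
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(xs))

# Normal equations from the 2x2 system above; the left-hand entries
# are the exact integrals of x^2, x and 1 over [p, q].
A = np.array([[(q**3 - p**3) / 3, (q**2 - p**2) / 2],
              [(q**2 - p**2) / 2, q - p]])
rhs = np.array([integral(xs * np.sin(xs)),  # integral of x f(x)
                integral(np.sin(xs))])      # integral of f(x)
a, b = np.linalg.solve(A, rhs)

e = np.sin(xs) - (a * xs + b)
sign_changes = np.count_nonzero(np.diff(np.sign(e)) != 0)
print(sign_changes)  # prints 2: the fitted line crosses sin(x) twice
```

For this symmetric example the fit is the horizontal line $g(x) = 2/\pi$, and the error indeed changes sign exactly twice, consistent with the $n+1$ conjecture for $n=1$.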
Yes, you are right. The polynomial will have at least $n+1$ intersection points with the original function. Here is the proof (credit goes to my girlfriend).
Assume $p_n(x)=\sum_{i=0}^na_ix^i$ is the polynomial that approximates the original function $f(x)$ in the least squares sense by minimising $\int_a^b(f(x) - p_n(x))^2dx$. Define $e(x)=f(x)-p_n(x)$; we want to prove that $e(x)$ has at least $n+1$ roots in the interval $[a,b]$.
Since the coefficients $a_i$ are determined by minimising $\int_a^b(f(x) - p_n(x))^2dx$, we have $\frac{\partial}{\partial a_i }\int_a^b(f(x) - p_n(x))^2dx = 0$ for $i = 0,1,2,\dots,n$. Carrying out the differentiation gives $\int_a^bx^i(f(x)-p_n(x))dx=\int_a^bx^ie(x)dx=0$ for $i = 0,1,2,\dots,n$. If you know something about polynomial orthogonality, you will recognise that $e(x)$ is orthogonal on $[a,b]$ to every polynomial of degree at most $n$, since $\{x^i\}_{i=0}^n$ forms a basis of that space.
Next, we prove the statement by contradiction. Assume $e(x)$ changes sign at only $m$ points in the interval $[a,b]$ where $m< n+1$, denoted as $c_1, c_2, \dots,c_m$. For simplicity we require these to be simple roots, so they are distinct (I will leave the case of multiple roots as an exercise for you). We construct another polynomial $q(x)=\pm(x-c_1)(x-c_2)\cdots(x-c_m)$, with the overall sign chosen so that $q(x)$ has the same sign as $e(x)$ for all $x$ in $[a,b]$; this is possible because $q$ and $e$ change sign at exactly the same points. Then $\int_a^bq(x)e(x)dx>0$. However, $q(x)$ has degree $m \le n$, so it can be expressed as a linear combination of $\{x^i\}_{i=0}^n$, and the orthogonality above forces $\int_a^bq(x)e(x)dx=0$, which is a contradiction. Thus $e(x)$ must change sign, and hence vanish, at least $n+1$ times in the interval $[a,b]$.
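To see both ingredients of the proof concretely, here is a small numerical check (my own example, with $f(x) = e^x$, $[a,b] = [0,1]$, $n = 3$): it solves the normal equations in the monomial basis, confirms that $\int_a^b x^i e(x)\,dx \approx 0$ for $i = 0,\dots,n$, and counts the sign changes of $e(x)$:

```python
import numpy as np

# Check both claims for f(x) = exp(x) on [a, b] = [0, 1] with n = 3:
# (i) e(x) = f(x) - p_n(x) is orthogonal to 1, x, x^2, x^3;
# (ii) e(x) changes sign at least n + 1 = 4 times.
n = 3
lo, hi = 0.0, 1.0
xs = np.linspace(lo, hi, 20001)

def integral(y):
    # composite trapezoid rule on the grid xs
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(xs))

fx = np.exp(xs)

# Normal equations: A[i, j] = integral of x^(i+j) (exact),
# rhs[i] = integral of x^i f(x).
A = np.array([[(hi**(i + j + 1) - lo**(i + j + 1)) / (i + j + 1)
               for j in range(n + 1)] for i in range(n + 1)])
rhs = np.array([integral(xs**i * fx) for i in range(n + 1)])
coeffs = np.linalg.solve(A, rhs)            # a_0, ..., a_n

pn = sum(c * xs**i for i, c in enumerate(coeffs))
e = fx - pn

ortho = [integral(xs**i * e) for i in range(n + 1)]  # all approximately 0
sign_changes = np.count_nonzero(np.diff(np.sign(e)) != 0)
print(sign_changes)  # at least n + 1 = 4 sign changes
```

The orthogonality residuals come out at roundoff/quadrature level, and the error changes sign four times, exactly as the proof predicts.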
This least squares approximation polynomial therefore interpolates the original function at $n+1$ points, but we can't use the simple/Lagrange/Newton interpolation methods to derive it, because we have no idea in advance where those interpolation points are.