I've been told in my geometry lectures (at university) that a way of calculating the projection of a vector onto a subspace is solving a linear system of equations. The only thing I don't know is what is the intuition behind such system. An answer giving a geometrical approach would work for me too!
(Maybe the text down below is a bit long but it's because the question needs a bit of background, I think).
Before that, let $E = [u_1, \ldots, u_n]$ be a vector space and $F = [v_1, \ldots, v_r]$ a subspace of $E$. Now, we know we can create the matrix $G$ (sometimes called $M$) whose entry $g_{ij}$ is the dot product of $v_i$ and $v_j$ (dot products will be denoted by $\langle v_i, v_j \rangle$). That means:
$$ G = \begin{pmatrix} \langle v_1, v_1 \rangle & \langle v_1, v_2 \rangle & \cdots & \langle v_1, v_r \rangle \\ \langle v_2, v_1 \rangle & \langle v_2, v_2 \rangle & \cdots & \langle v_2, v_r \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle v_r, v_1 \rangle & \langle v_r, v_2 \rangle & \cdots & \langle v_r, v_r \rangle \\ \end{pmatrix} $$
(Notes: 1. Because of the symmetry of the dot product, this matrix is symmetric. 2. Since the vectors $v_1, \ldots, v_r$ are linearly independent, $G$ is invertible. 3. If the basis of $F$ is orthogonal, the matrix is diagonal. 4. If the basis of $F$ is orthonormal, the matrix $G$ is simply the identity matrix. 5. It's called $G$ because it is the Gram matrix, or $M$ because it is also called the "metric matrix".)
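To make these notes concrete, here is a tiny numerical sketch (the vectors $v_1, v_2$ are my own example, not from the lectures): it builds the Gram matrix for a two-vector basis in $\mathbb{R}^3$ and checks the symmetry and invertibility claims.

```python
import numpy as np

# Hypothetical example basis of a 2-dimensional subspace F of R^3.
v1 = np.array([1.0, 2.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])

S = np.column_stack([v1, v2])  # basis vectors as columns
G = S.T @ S                    # Gram matrix: G[i, j] = <v_i, v_j>

print(np.allclose(G, G.T))            # True: G is symmetric
print(np.linalg.matrix_rank(G) == 2)  # True: G is invertible (v1, v2 independent)
```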
Having said that, this matrix is the one that appears in the system of equations that is used to find the projection I was talking about:
Suppose $w \in E$ and $w \not\in F$. We want to find the projection of $w$ onto $F$, that is, $\Pi_F (w)$. I know one way to think of it is via "least squares" (writing $G = S^tS$, where $S$ is the matrix whose columns are the vectors $v_1, \ldots, v_r$), but that's not really how I want to understand it.
I know that the vector space $E$ can be decomposed as $E = F \oplus F^\perp$. Extending the basis of $F$ with a basis $v_{r+1}, \ldots, v_n$ of $F^\perp$, we want to find the coefficients $\alpha_1, \ldots, \alpha_r$ (for $F$) and $\alpha_{r+1}, \ldots, \alpha_n$ (for $F^\perp$) such that $w = \alpha_1 v_1 + \cdots + \alpha_r v_r + \alpha_{r+1} v_{r+1}+ \cdots + \alpha_n v_n$, and then keep only the first $r$ terms, because their sum is precisely the projection of $w$ onto $F$. Putting all this together, we see:
$$w = \alpha_1 v_1 + \cdots + \alpha_r v_r \, + \, \text{"vector"}$$ where "vector" $\in F^\perp$. The next step after getting to this point sounds like "magic" to me: take the dot product of the equation with $v_1$, ..., and with $v_r$. Since $\langle v_i, \text{"vector"} \rangle = 0$ for each $i$ (because "vector" $\in F^\perp$), doing this leads us (finally) to the following system of equations:
$$ \begin{pmatrix} \langle v_1, v_1 \rangle & \langle v_1, v_2 \rangle & \cdots & \langle v_1, v_r \rangle \\ \langle v_2, v_1 \rangle & \langle v_2, v_2 \rangle & \cdots & \langle v_2, v_r \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle v_r, v_1 \rangle & \langle v_r, v_2 \rangle & \cdots & \langle v_r, v_r \rangle \\ \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \vdots \\ \vdots \\ \alpha_r \\ \end{pmatrix} = \begin{pmatrix} \langle v_1, w \rangle \\ \langle v_2, w \rangle \\ \vdots \\ \langle v_r, w \rangle \\ \end{pmatrix} $$
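For what it's worth, the system does produce the projection numerically; here is a small sketch (again with vectors $v_1, v_2, w$ of my own choosing) that solves $G\alpha = (\langle v_i, w \rangle)_i$ and verifies that the residual $w - \Pi_F(w)$ is orthogonal to every $v_i$:

```python
import numpy as np

# Hypothetical example: project w onto F = span{v1, v2} in R^3.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
w  = np.array([2.0, 3.0, 4.0])

S = np.column_stack([v1, v2])   # basis vectors of F as columns
G = S.T @ S                     # Gram matrix: G[i, j] = <v_i, v_j>
b = S.T @ w                     # right-hand side: <v_i, w>

alpha = np.linalg.solve(G, b)   # coefficients alpha_1, ..., alpha_r
proj = S @ alpha                # the projection of w onto F

# The leftover part w - proj should lie in F-perp,
# i.e. be orthogonal to every basis vector of F:
print(np.allclose(S.T @ (w - proj), 0))  # True
```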
And that's what I don't understand. I get it if the basis (of the subspace $F$) is orthogonal or orthonormal, but what do the dot products $\langle v_i, v_j \rangle$ with $i \neq j$ mean geometrically? I've tried to rewrite the system using $\langle v_i, v_j \rangle = |v_i||v_j|\cos(\alpha)$ and I've noticed that in the $i$-th equation $|v_i|$ cancels out, but I couldn't deduce much more... Also, why do we take the dot product? I know it has a lot to do with projections, but I don't know how it... "fits" in here. Is there a way to construct the system of equations more... intuitively? That's my doubt. I hope you understand my question.
Thanks a lot for reading this far!