I am currently trying to speedrun my linear analysis course. I have been doing pretty well so far, but I hit a wall when the lectures got to the SVD and under/overdetermined systems of equations. The set-up is as follows:
Suppose you have an equation of the type $ A\vec{x} = \vec{b} $, where $ A \in \mathbb{R}^{m \times n} $ is generally a non-square matrix. In the case that this system is underdetermined, the full solution set is $\vec{x} = \vec{x}_p + Ker(A) $, where $\vec{x}_p$ is any particular solution; geometrically, the "tips" of all solution vectors fill out a copy of the null space shifted away from the origin. Among these we want to find at least the solution with the smallest possible norm, which is the one perpendicular to the null space, i.e., the one lying in the row space, or in other words $ \vec{x} = A^T\vec{z} $ for some $\vec{z}$.
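To convince myself of this picture I ran a quick numpy sanity check with made-up numbers: the minimum-norm solution is orthogonal to the null space, and adding any null-space vector to it gives another, strictly longer, solution:

```python
import numpy as np

# Made-up 2x3 underdetermined system (more unknowns than equations).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([1.0, 2.0])

# lstsq returns the minimum-norm solution when the system is underdetermined.
x_min, *_ = np.linalg.lstsq(A, b, rcond=None)

# Null space of A from the SVD: right singular vectors past the rank.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
null_basis = Vt[rank:]           # each row is a unit null-space vector

# x_min lies in the row space, i.e. it is orthogonal to the null space,
# and shifting it by a null-space vector gives another (longer) solution.
x_other = x_min + null_basis[0]
print(np.allclose(A @ x_min, b))                        # solves the system
print(np.allclose(null_basis @ x_min, 0))               # perpendicular to Ker(A)
print(np.linalg.norm(x_other) > np.linalg.norm(x_min))  # any shift is longer
```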
Now if we multiply our original equation from the left by $A^T$, we get:
$ A^TA\vec{x} = A^T\vec{b} $, where we would usually label $G := A^TA$ as the Gram matrix (assuming the inner product on that space is identical to the dot product; otherwise we would insert the matrix representation of the inner product between $A^T$ and $A$).
This is where my problem starts. The set-up further says that if the columns of $A$ are linearly independent (equivalently, if $G$ is invertible), then there exists a matrix $G^{-1}$, giving us (multiplying by the inverse from the left):
$ \vec{x} = G^{-1}A^T\vec{b} $
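For the full-column-rank case I checked this formula numerically (made-up tall matrix, so $G = A^TA$ is actually invertible); it matches the least-squares solution numpy returns:

```python
import numpy as np

# Tall matrix with linearly independent columns, so G = A^T A is invertible.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([0.0, 1.0, 1.0])

G = A.T @ A
x = np.linalg.solve(G, A.T @ b)   # x = G^{-1} A^T b

# Same vector as numpy's least-squares solver returns.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ref))
```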
Using the fact that $\vec{x} = A^T\vec{z}$ and substituting into the original equation, we get $AA^T\vec{z} = \vec{b}$ (note that the Gram matrix here is $AA^T$, the Gram matrix of the rows). If the rows of $A$ are linearly independent, $\vec{z} = (AA^T)^{-1}\vec{b}$, so our particular solution looks like:
$\vec{x}_0 = A^T(AA^T)^{-1}\vec{b}$
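Again a quick numerical check (made-up numbers): writing the formula with the Gram matrix of the rows, $AA^T$, which is the one that is invertible when $A$ has full row rank, it reproduces the minimum-norm (pseudoinverse) solution:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])         # full row rank, so A A^T is invertible
b = np.array([1.0, 2.0])

G_rows = A @ A.T                        # 2x2 Gram matrix of the rows
x0 = A.T @ np.linalg.solve(G_rows, b)   # x0 = A^T (A A^T)^{-1} b

# Matches the Moore-Penrose pseudoinverse solution, which is minimum-norm.
x_pinv = np.linalg.pinv(A) @ b
print(np.allclose(x0, x_pinv))
print(np.allclose(A @ x0, b))
```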
The problem is, what if the Gram matrix ISN'T invertible (i.e., its columns aren't linearly independent)? What is the visual representation of that? And how do we find the "smallest" solution in that case?