Solve the problem of minimizing $f(x) = ||Ax-b||$. Consider all the cases and interpret geometrically.
Since minimizing $\|Ax-b\|$ is the same as minimizing its square, if we write
$$\|Ax-b\|^2 = (a_{11}x_1 + \cdots + a_{1n}x_n - b_1)^2 + \cdots + (a_{n1}x_1 + \cdots + a_{nn}x_n - b_n)^2$$
then
$$\frac{\partial \|Ax-b\|^2}{\partial x_j} = 2(a_{11}x_1 + \cdots + a_{1n}x_n - b_1)a_{1j} + \cdots + 2(a_{n1}x_1 + \cdots + a_{nn}x_n - b_n)a_{nj}$$
If I try to set $\frac{\partial \|Ax-b\|^2}{\partial x_j} = 0$ I get nothing useful. For $x$ to be a minimizer, I have to have gradient $0$ and hessian positive definite. If we compute the hessian just to see:
$$\frac{\partial^2 \|Ax-b\|^2}{\partial x_k\partial x_j} = 2a_{1k}a_{1j} + \cdots + 2a_{nk}a_{nj}$$
I see nothing useful here.
I think the geometric interpretation comes from the conditions for the gradient to be $0$ and the hessian to be $>0$, but I don't find these conditions useful.
Any ideas?
To expand on one of the comments on your question, the norm minimization problem you are considering is a projection problem. Note that projecting a vector $b$ onto a set $S$ is nothing but finding a point $x\in S$ such that the distance between $b$ and $x$ is minimized. Define the projection function as $\Pi_S:\mathbb R^m\to S$. The projection of $x$ onto $S$ is given by $\Pi_S(x)$: $$ \Pi_S(x)=\arg\min_{y\in S}\|x-y\|. $$ Therefore, if $S$ is defined to be the range of $A$, that is, $$ S=\{Ax: x\in\mathbb R^n\}, $$ then the problem of minimizing $\|b-Ax\|$, which is a least squares problem, can be understood as projecting $b$ onto the range of $A$, i.e., finding $\Pi_S(b)$.
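As a small illustration (the matrix and vector here are chosen purely as an example), take $m=2$, $n=1$ and $$ A=\begin{pmatrix}1\\1\end{pmatrix},\qquad b=\begin{pmatrix}0\\2\end{pmatrix}. $$ Then $S=\{(t,t): t\in\mathbb R\}$ is the diagonal line in the plane, and $$ \|b-Ax\|^2 = x^2+(2-x)^2, $$ which is minimized at $x=1$. So $\Pi_S(b)=(1,1)^T$, the point of the line closest to $b$, and the residual $b-\Pi_S(b)=(-1,1)^T$ is perpendicular to the line.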
The projection is given by a linear operator, say $P$. $P$ is symmetric and, moreover, by definition the projection of each vector $y\in S$ onto $S$ is $y$ itself: $Py=y$. This means that $PA=A$. As an exercise, try to see that the projection satisfies the intuitive idea that $Px-x$ should be orthogonal to each vector in $S$ (Hint: $A^T(I-P)=0$). See the figure below:
The projection matrix is unique (see why!), and therefore for each $b$ there is a unique point in the range of $A$ that is the projection of $b$, namely $Pb$.
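One way to connect this with the partial derivatives in the question (a sketch, using the squared norm, which has the same minimizers as the norm): the residual $b-Ax$ is orthogonal to every column of $A$ exactly when $$ A^T(b-Ax)=0 \iff A^TAx=A^Tb. $$ On the other hand, collecting the partial derivatives of $\|Ax-b\|^2$ into a vector and a matrix gives $$ \nabla\|Ax-b\|^2 = 2A^T(Ax-b),\qquad \nabla^2\|Ax-b\|^2 = 2A^TA. $$ So a vanishing gradient is the same statement as the orthogonality condition. The Hessian $2A^TA$ is always positive semidefinite, since $v^TA^TAv=\|Av\|^2\ge 0$, so the function is convex and every point with zero gradient is a global minimizer. The Hessian is positive definite precisely when $Av=0$ forces $v=0$.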
The projection onto the range of $A$ can be found using the Moore-Penrose inverse of $A$. The Moore-Penrose inverse $A^\dagger$ satisfies, among other properties, $$ AA^\dagger A=A, \quad A^\dagger AA^\dagger=A^\dagger, $$ and $AA^\dagger$ is symmetric. With a little effort one can see that $P=AA^\dagger$. Therefore $z=AA^\dagger b$ is the projection of $b$ onto $S$, so we have found the point $z$ in the range of $A$ that minimizes the distance to $b$. But what about finding $x$ such that $Ax=z$? In general the solution is not unique; however, we can see that $$ x=A^\dagger b\implies Ax=AA^\dagger b=z. $$ Hence $x=A^\dagger b$ is a solution. Moreover, for any $v$ in the kernel of $A$, $x+v$ is also a solution. The solution is unique exactly when the kernel of $A$ contains only the zero vector.
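To see the non-unique case concretely (again with data chosen only for illustration), take $$ A=\begin{pmatrix}1&1\\1&1\end{pmatrix},\qquad b=\begin{pmatrix}3\\1\end{pmatrix}. $$ Here $A^2=2A$, and one can check directly that $$ A^\dagger=\frac14\begin{pmatrix}1&1\\1&1\end{pmatrix} $$ satisfies the identities above, with $AA^\dagger=\frac12\begin{pmatrix}1&1\\1&1\end{pmatrix}$ symmetric. Then $$ x=A^\dagger b=\begin{pmatrix}1\\1\end{pmatrix},\qquad z=Ax=\begin{pmatrix}2\\2\end{pmatrix},\qquad b-z=\begin{pmatrix}1\\-1\end{pmatrix}, $$ so the minimum distance is $\sqrt2$. The kernel of $A$ is spanned by $(1,-1)^T$, so every $x=(1+t,\,1-t)^T$ with $t\in\mathbb R$ attains the same minimum.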
Then how can we find $A^\dagger$? If $A^TA$ is invertible, then $A^\dagger=(A^TA)^{-1}A^T$, as you can see in the other answers. Otherwise, there are other methods, such as the singular value decomposition, for finding the Moore-Penrose inverse.
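For the case where $A^TA$ is invertible, the formula can be tried out numerically. The following is only a minimal sketch in plain Python with exact rational arithmetic; the helper names (`transpose`, `matmul`, `solve`, `least_squares`) and the example data are assumptions made here for illustration, not part of any library. It solves $A^TAx=A^Tb$, forms $z=Ax$, and checks that the residual $b-z$ is orthogonal to the columns of $A$.

```python
from fractions import Fraction


def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination (M square and invertible)."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; A^T A is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def least_squares(A, b):
    """Return (x, z) with A^T A x = A^T b and z = A x, the projection of b."""
    A = [[Fraction(v) for v in row] for row in A]
    b_col = [[Fraction(v)] for v in b]
    At = transpose(A)
    x = solve(matmul(At, A), matmul(At, b_col))
    z = matmul(A, x)
    return [r[0] for r in x], [r[0] for r in z]


# Example data (an assumption for illustration): 3 equations, 2 unknowns.
A = [[1, 0], [1, 1], [1, 2]]
b = [6, 0, 0]

x, z = least_squares(A, b)
residual = [bi - zi for bi, zi in zip(b, z)]
# Dot product of each column of A with the residual; should all be zero.
ortho = [sum(a * r for a, r in zip(col, residual)) for col in zip(*A)]

print("x =", [str(v) for v in x])                # x = ['5', '-3']
print("z =", [str(v) for v in z])                # z = ['5', '2', '-1']
print("A^T (b - z) =", [str(v) for v in ortho])  # A^T (b - z) = ['0', '0']
```

When $A^TA$ is singular, `solve` raises an error. That is the case where the kernel of $A$ is nontrivial and a singular value decomposition (or a library pseudoinverse routine) would be the tool to reach for instead.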