If we have a linear system $Ax = y$ and $A$ is not invertible (let's say it's rectangular), we would still like a matrix $G$ such that we can find a solution $x$ by computing $x = Gy$. We can then call $G$ a generalized inverse of $A$, and this makes sense to me.
The first definition of a generalized inverse of $A$ is then a matrix $G$ such that $AGy = y$ for all $y$ in the column space of $A$.
However, the generalized inverse is often defined as satisfying $AGA = A$, and the two definitions are supposedly equivalent.
I am struggling to prove both directions of the equivalence between the two definitions.
If we say that $A$ is $m \times n$, $x$ is $n \times 1$, $y$ is $m \times 1$ and $G$ is $n \times m$, then $AG$ is $m \times m$ and $GA$ is $n \times n$, making them both square matrices, but with no guarantee that they are invertible, so we cannot use that.
If $AGA = A$, take any $y$ in the column space of $A$. Then there is some $x$ such that $y = Ax$. From $AGA=A$, we get $AGAx = Ax$, so $AGy = y$.
If $AGy = y$ for all $y$ in the column space of $A$, then in particular, $AGA_i = A_i$, where $A_i$ is the $i^{\text{th}}$ column of $A$, for each $i$. By the definition of matrix multiplication, $AGA_i$ is also $(AGA)_i$: the $i^{\text{th}}$ column of $AGA$. Therefore $AGA$ and $A$ agree on every column, so $AGA=A$.
(An alternative way to phrase the second direction: if $AGy=y$ for all $y$ in the column space of $A$, then $AGAx = Ax$ for all $x$, or in other words $(AGA-A)x = 0$ for all $x$. But the only matrix $M$ that has $Mx=0$ for all $x$ is the zero matrix, so $AGA-A=0$, or $AGA=A$.)
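As a numerical sanity check of both directions, here is a minimal NumPy sketch. It assumes a randomly generated rank-deficient $4 \times 3$ matrix, and the helper names `satisfies_aga` and `fixes_column_space` are made up for this illustration. It uses the Moore-Penrose pseudoinverse as one generalized inverse and builds a second one as $G + (I - GA)W$ for an arbitrary $W$, which works because $A(I - GA) = A - AGA = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 4 x 3 matrix of rank 2: rectangular and rank-deficient.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

# The Moore-Penrose pseudoinverse is one generalized inverse of A.
# rcond is set explicitly so the numerically zero singular value is dropped.
G = np.linalg.pinv(A, rcond=1e-10)

# A different generalized inverse: G2 = G + (I - G A) W for an arbitrary W.
W = rng.standard_normal((3, 4))
G2 = G + (np.eye(3) - G @ A) @ W


def satisfies_aga(A, G):
    """Second definition: A G A = A (up to floating-point tolerance)."""
    return np.allclose(A @ G @ A, A)


def fixes_column_space(A, G, rng, trials=100):
    """First definition, sampled: A G y = y for random y = A x."""
    for _ in range(trials):
        y = A @ rng.standard_normal(A.shape[1])
        if not np.allclose(A @ G @ y, y):
            return False
    return True


# A random y in R^4 is almost surely NOT in the 2-dimensional column space,
# so A G y = y is not expected to hold for it.
y_out = rng.standard_normal(4)
outside_fixed = np.allclose(A @ G @ y_out, y_out)

print(satisfies_aga(A, G), fixes_column_space(A, G, rng))
print(satisfies_aga(A, G2), fixes_column_space(A, G2, rng))
print(outside_fixed)
```

Both definitions hold together for each of $G$ and `G2`, while the last check illustrates why the first definition is restricted to $y$ in the column space of $A$. A finite sample of vectors is of course only evidence, not a proof; the argument above is what establishes the equivalence.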