When it comes to solving an equation containing matrices, I don't always understand some of the rules involved. In particular, I am trying to follow the derivation of the Gauss-Newton algorithm. In this case I have to solve for $z$:
$A^TAz = A^Tb$
According to Wikipedia, the solution is:
$z = (A^TA)^{-1}A^Tb$
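
To make the formula concrete, here is a minimal numerical sketch in Python with NumPy. The matrix `A` and vector `b` are made-up example values, not taken from any particular derivation. I am assuming `A` has more rows than columns, as the Jacobian typically does in Gauss-Newton. The code solves $A^TAz = A^Tb$ directly and compares the result with NumPy's built-in least-squares routine.

```python
import numpy as np

# Hypothetical example data (an assumption for illustration only):
# 3 equations, 2 unknowns, so A is 3x2 and not square.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Form the normal equations (A^T A) z = A^T b.
AtA = A.T @ A   # 2x2 square matrix
Atb = A.T @ b   # length-2 vector

# Solve the 2x2 system; equivalent to z = (A^T A)^{-1} A^T b,
# but without forming the inverse explicitly.
z = np.linalg.solve(AtA, Atb)

# Cross-check against NumPy's least-squares solver.
z_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print("z from normal equations:", z)
print("z from lstsq:           ", z_lstsq)
```

In this sketch $A^TA$ is a $2 \times 2$ matrix even though $A$ itself is $3 \times 2$.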
I understand what they are doing here, but why can't you do the following?
$(A^T)^{-1}A^TAz = (A^T)^{-1}A^Tb$
$Az = b$
$z = A^{-1}b$
I have had this question for some time while reading various derivations. What is the underlying rule that prevents me from doing it the second way?