I recall a machine learning assignment where we fit a regression model to a data set twice: once with our own gradient descent implementation, and once with the (right) pseudoinverse technique. At least in that particular instance, the latter ran much faster and fit the data much better.
When solving $Ax = b$ via least squares, are there properties of the matrix $A$ that would make the pseudoinverse technique fail? I assume that is the only reason one would resort to iterative techniques.
Iterative techniques can be faster when a good approximate solution is already known: each iteration costs only a matrix-vector product ($O(n^2)$ dense, or $O(\mathrm{nnz}(A))$ sparse), whereas computing the pseudoinverse via the SVD is an $O(n^3)$ process. Additionally, although the SVD is a numerically stable algorithm even for large $n$, when $A$ is ill-conditioned the relative errors in the solution components associated with small singular values may be large.
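As a minimal sketch of the two approaches (using NumPy's `pinv` for the SVD-based pseudoinverse and SciPy's `lsqr` as the iterative solver; the matrix sizes and tolerances here are arbitrary choices for illustration):

```python
import numpy as np
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
m, n = 200, 50  # overdetermined system
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Direct approach: form the pseudoinverse via the SVD (roughly O(n^3) work)
x_pinv = np.linalg.pinv(A) @ b

# Iterative approach: LSQR touches A only through matrix-vector products,
# so each iteration is cheap and sparsity can be exploited
x_lsqr = lsqr(A, b, atol=1e-12, btol=1e-12)[0]

# On a well-conditioned problem both minimize ||Ax - b|| and agree closely
print(np.linalg.norm(x_pinv - x_lsqr))
```

For a well-conditioned dense $A$ the two answers coincide to high precision; the trade-offs described above only start to bite for large, sparse, or ill-conditioned problems.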