I want to solve the following least squares problem: $$ x = \textrm{argmin}_x || f(x) ||^2 $$ where $x \in \mathbb{R}^n$, $f(x) \in \mathbb{R}^m$. I know that it is a linear least squares problem, i.e. $$ \frac{\partial f}{\partial x} = A $$ where $A \in \mathbb{R}^{m \times n}$ is not a function of $x$. Assume also that $A$ has full column rank so that $x$ is unique. I am given $f$ and $A$ and I need to find $x$. Since it is a linear problem, I know that $$ f = A x + b $$ for some $b$, but I don't know what $b$ is. Using this form, the usual least squares solution is: $$ x = \textrm{argmin}_x || Ax + b ||^2 = -(A^T A)^{-1} A^T b $$ Inserting $b = f - Ax$ into this gives $$ x = -(A^T A)^{-1} A^T b = -(A^T A)^{-1} A^T f + x $$ which implies that $f = 0$, which is certainly not the case. Can anyone see the error here?
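For concreteness, here is a quick numerical check of the closed-form solution against NumPy's least squares solver (a sketch only; the sizes and random data are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3
A = rng.standard_normal((m, n))   # full column rank with probability 1
b = rng.standard_normal(m)

# x = argmin ||Ax + b||^2 via the normal equations
x_normal = -np.linalg.solve(A.T @ A, A.T @ b)

# Reference: least squares solve of A x = -b
x_lstsq, *_ = np.linalg.lstsq(A, -b, rcond=None)

assert np.allclose(x_normal, x_lstsq)
```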
Alternatively, if I plug the solution for $x$ into $f = Ax + b$, I find $$ f = -A (A^T A)^{-1} A^T b + b = \left( I_m - A (A^T A)^{-1} A^T \right) b $$ so that $$ x = -(A^T A)^{-1} A^T \left( I_m - A (A^T A)^{-1} A^T \right)^{-1} f $$ which appears to give me the solution in terms of the known quantities $A$ and $f$. However, for my problem, this approach is not practical since $m \gg n$ and the matrix $A (A^T A)^{-1} A^T$ will be too large to fit in my RAM. Does anyone see a clever way to simplify this expression and avoid the need to compute the large $m \times m$ matrix?
The $n \times n$ matrix $A^T A$ is of modest size and can fit in my memory ok.
EDIT: it seems that the matrix $I - A(A^T A)^{-1} A^T$ is singular, so my formula for $x$ won't work.
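To confirm the singularity numerically on a toy problem small enough to form the $m \times m$ matrix explicitly (the sizes here are made up): $I - A(A^TA)^{-1}A^T$ is an orthogonal projector of rank $m - n$, so it is singular whenever $n \geq 1$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 2                              # small enough to build the m x m matrix
A = rng.standard_normal((m, n))

P = A @ np.linalg.solve(A.T @ A, A.T)    # projection onto col(A)
Q = np.eye(m) - P                        # I - A (A^T A)^{-1} A^T

# Q is an orthogonal projector (Q^2 = Q) of rank m - n, hence singular
assert np.allclose(Q @ Q, Q)
assert np.linalg.matrix_rank(Q) == m - n
```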
As I understand it, the $\ f\ $ that you know is the value $\ f=Ax^* + b\ $ at the minimising value $\ x^*\ $, which must satisfy the equation $\ Ax^*=-b^\parallel\ $, with $\ b^\parallel\ $ the orthogonal projection of $\ b\ $ onto the column space of $\ A\ $. But then $\ f=b-b^\parallel\ $ is the component of $\ b\ $ orthogonal to the column space of $\ A\ $, so $\ A^Tf=0\ $. If this is the only function of $\ b\ $ that you know, however, then your minimising value could be any $\ x\in \mathbb{R}^n\ $, so you have no way of determining a minimiser from that information alone.
Let $\ x^\dagger\in \mathbb{R}^n\ $ and $\ \beta=f-Ax^\dagger\ $. Since $\ A^Tf=0\ $, it follows that $\ \|Ax^\dagger+\beta\,\|^2=\min_x \|Ax+\beta\,\|^2\ $, and $\ f=Ax^\dagger+ \beta\ $. Since $\ x^\dagger\ $ here can be arbitrary, just knowing $\ f\ $ and $\ A\ $ gives you no information about it.
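A numerical illustration of this non-uniqueness (a sketch; the sizes and random data are arbitrary): the known residual $f$ satisfies $A^Tf = 0$, and any $x^\dagger$ whatsoever, with $\beta = f - Ax^\dagger$, is the minimiser of $\|Ax+\beta\|^2$ and reproduces the same $f$.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 50, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x_star, *_ = np.linalg.lstsq(A, -b, rcond=None)   # minimiser of ||Ax + b||^2
f = A @ x_star + b                                # the known residual

# f lies in the orthogonal complement of col(A): A^T f = 0
assert np.allclose(A.T @ f, 0)

# An arbitrary x_dagger, with beta = f - A x_dagger, is itself the
# minimiser of ||Ax + beta||^2 and yields exactly the same f
x_dagger = rng.standard_normal(n)
beta = f - A @ x_dagger
x_check, *_ = np.linalg.lstsq(A, -beta, rcond=None)
assert np.allclose(x_check, x_dagger)
assert np.allclose(A @ x_check + beta, f)
```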