Linear least squares question


I want to solve the following least squares problem: $$ x = \textrm{argmin}_x \| f(x) \|^2 $$ where $x \in \mathbb{R}^n$ and $f(x) \in \mathbb{R}^m$. I know that it is a linear least squares problem, i.e. $$ \frac{\partial f}{\partial x} = A $$ where $A \in \mathbb{R}^{m \times n}$ is not a function of $x$. Assume also that $A$ has full column rank, so that the minimiser is unique. I am given $f$ and $A$, and I need to find $x$. Since the problem is linear, I know that $$ f = A x + b $$ for some $b$, but I don't know what $b$ is. Using this form, the usual least squares solution is $$ x = \textrm{argmin}_x \| Ax + b \|^2 = -(A^T A)^{-1} A^T b. $$ Inserting $b = f - Ax$ into this gives $$ x = -(A^T A)^{-1} A^T (f - Ax) = -(A^T A)^{-1} A^T f + x, $$ which implies that $A^T f = 0$, which is certainly not true in general. Can anyone see the error here?
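As a quick numerical sanity check (numpy; random $A$ and $b$, with sizes and names chosen arbitrarily for illustration), one can solve the normal equations and observe that the residual at the minimiser satisfies $A^T f = 0$ even though $f \neq 0$:

```python
import numpy as np

# Sanity check of the normal-equations solution with random data.
# The sizes m, n and all variable names are arbitrary illustrative choices.
rng = np.random.default_rng(0)
m, n = 50, 4
A = rng.standard_normal((m, n))   # full column rank with probability 1
b = rng.standard_normal(m)

# Minimiser of ||Ax + b||^2 from the normal equations A^T A x = -A^T b.
x = -np.linalg.solve(A.T @ A, A.T @ b)

# The residual f = A x + b at the minimiser is orthogonal to the
# columns of A, i.e. A^T f = 0 -- but f itself is generally nonzero.
f = A @ x + b
print(np.linalg.norm(A.T @ f))   # ~ 0 (machine precision)
print(np.linalg.norm(f))         # generally far from zero
```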

Alternatively, if I plug the solution for $x$ into $f = Ax + b$, I find $$ f = -A (A^T A)^{-1} A^T b + b = \left( I_m - A (A^T A)^{-1} A^T \right) b, $$ so that $$ x = -(A^T A)^{-1} A^T \left( I_m - A (A^T A)^{-1} A^T \right)^{-1} f, $$ which appears to give the solution in terms of the known quantities $A$ and $f$. However, this approach is not practical for my problem: since $m \gg n$, the matrix $A (A^T A)^{-1} A^T$ is far too large to fit in my RAM. Does anyone see a clever way to simplify this expression and avoid forming the large $m \times m$ matrix?

The $n \times n$ matrix $A^T A$ is of modest size and fits in memory without trouble.

EDIT: the matrix $I - A(A^T A)^{-1} A^T$ is singular (it is the orthogonal projector onto the complement of the column space of $A$, so it has rank $m - n$), which means my formula for $x$ above won't work.
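The singularity is easy to confirm numerically (numpy; small arbitrary sizes for illustration):

```python
import numpy as np

# I - A (A^T A)^{-1} A^T is the orthogonal projector onto the complement
# of the column space of A, so its rank is m - n < m: it is always
# singular (for n >= 1) and cannot be inverted.
rng = np.random.default_rng(1)
m, n = 20, 3                      # small sizes chosen for illustration
A = rng.standard_normal((m, n))
P = np.eye(m) - A @ np.linalg.solve(A.T @ A, A.T)
print(np.linalg.matrix_rank(P))   # m - n = 17
# Its eigenvalues are only 0s and 1s, as for any orthogonal projector.
```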


There are 2 best solutions below


As I understand it, the $\ f\ $ that you know is the value $\ f=Ax^* + b\ $ at the minimising value $\ x^*\ $, which must satisfy the equation $\ Ax^*=-b^\perp\ $, with $\ b^\perp\ $ the perpendicular projection of $\ b\ $ onto the column space of $\ A\ $. If this is the only function of $\ b\ $ that you know, however, then your minimising value could be any $\ x\in \mathbb{R}^n\ $ so you have no way of determining a minimiser from that information alone.

Indeed, let $\ x^\dagger\in \mathbb{R}^n\ $ be arbitrary and set $\ \beta=f-Ax^\dagger\ $, so that $\ f=Ax^\dagger+\beta\ $. Because $\ f\ $ is the residual at the minimiser, $\ A^Tf=0\ $, and hence the gradient $\ 2A^T(Ax^\dagger+\beta\,)=2A^Tf\ $ vanishes at $\ x^\dagger\ $: that is, $\ \|Ax^\dagger+\beta\,\|^2=\min \|Ax+\beta\,\|^2\ $. Since $\ x^\dagger\ $ here can be arbitrary, just knowing $\ f\ $ and $\ A\ $ gives you no information about it.
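A quick numerical illustration of this point (numpy; all sizes and names are arbitrary choices): whenever $A^Tf = 0$, every $x^\dagger$ is recovered as the least-squares minimiser for its own $\beta = f - Ax^\dagger$.

```python
import numpy as np

# If A^T f = 0, then for ANY x_dag, beta = f - A x_dag makes x_dag a
# minimiser of ||A x + beta||^2 -- so f and A alone cannot pin down
# the minimiser.  All names and sizes here are illustrative.
rng = np.random.default_rng(2)
m, n = 30, 3
A = rng.standard_normal((m, n))

# Build an f orthogonal to the column space of A.
v = rng.standard_normal(m)
f = v - A @ np.linalg.solve(A.T @ A, A.T @ v)   # so A^T f = 0

for _ in range(3):
    x_dag = rng.standard_normal(n)              # arbitrary candidate
    beta = f - A @ x_dag
    # Least-squares minimiser of ||A x + beta||^2:
    x_min = -np.linalg.solve(A.T @ A, A.T @ beta)
    print(np.linalg.norm(x_min - x_dag))        # ~ 0 every time
```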


First, let's clarify the notation by distinguishing between $x_{min}$, which is the solution to $x_{min} = \textrm{argmin} \| f(x) \|^2$, and $x$, which is just a variable with no specific value.

With this distinction, the puzzle about the value of $f(x)$ is resolved: what you actually have is

$x_{min} -x=-(A^TA)^{-1}A^Tf(x)$ for arbitrary $x$

(note the minus sign: substituting $f(x) = Ax + b$ gives $-(A^TA)^{-1}A^T(Ax+b) = -x + x_{min}$).

Setting $x=0$ will give you $x_{min}=-(A^TA)^{-1}A^Tf(0)$

which is consistent with

$x_{min}=-(A^TA)^{-1}A^Tb$

since $b=f(0)$.
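This identity is easy to verify numerically. The sketch below (numpy; random data, all names illustrative) uses the sign convention $x_{min} = -(A^TA)^{-1}A^Tb$ from the question, under which the identity reads $x_{min} - x = -(A^TA)^{-1}A^Tf(x)$:

```python
import numpy as np

# Numerical check of x_min - x = -(A^T A)^{-1} A^T f(x) for arbitrary x
# (note the minus sign, matching x_min = -(A^T A)^{-1} A^T b).
rng = np.random.default_rng(3)
m, n = 40, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
f = lambda x: A @ x + b

x_min = -np.linalg.solve(A.T @ A, A.T @ b)

x = rng.standard_normal(n)                      # arbitrary point
lhs = x_min - x
rhs = -np.linalg.solve(A.T @ A, A.T @ f(x))
print(np.linalg.norm(lhs - rhs))                # ~ 0

# Special case x = 0 recovers x_min directly, since b = f(0):
x0 = -np.linalg.solve(A.T @ A, A.T @ f(np.zeros(n)))
print(np.linalg.norm(x_min - x0))               # ~ 0
```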