What are the relations between these two minimizations


What are the relations between the minimization problems $\arg\min_{\mathbf{y}=A\mathbf{x}}\left\Vert \mathbf{x}\right\Vert _{2}$ and $\arg\min_{\mathbf{x}}\left\Vert A\mathbf{x}-\mathbf{y}\right\Vert _{2}$?


There are 2 best solutions below


No relation, unless $y=Ax$ has a unique solution.

If $y=Ax$ is not solvable, the feasible set is empty, so $\arg\min_{\mathbf{y}=A\mathbf{x}}\left\Vert \mathbf{x}\right\Vert _{2}$ does not exist; $\arg\min_{\mathbf{x}}\left\Vert A\mathbf{x}-\mathbf{y}\right\Vert _{2}$, on the other hand, always exists.

If $y=Ax$ has infinitely many solutions, $x_0=\arg\min_{\mathbf{y}=A\mathbf{x}}\left\Vert \mathbf{x}\right\Vert _{2}$ is unique, but $\arg\min_{\mathbf{x}}\left\Vert A\mathbf{x}-\mathbf{y}\right\Vert _{2}$ has infinitely many minimizers: every solution of $y=Ax$ attains objective value zero. In this case, $x_0$ is among them, and it is the one of smallest norm.

If $y=Ax$ has a unique solution $x_0$, we have $\arg\min_{\mathbf{y}=A\mathbf{x}}\left\Vert \mathbf{x}\right\Vert _{2}=\arg\min_{\mathbf{x}}\left\Vert A\mathbf{x}-\mathbf{y}\right\Vert _{2}=x_0$.
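To make the infinitely-many-solutions case concrete, here is a small NumPy sketch (the matrix and vectors are arbitrary examples, not from the question): the pseudoinverse gives the minimum-norm solution $x_0$ of a consistent underdetermined system, and adding any nullspace component gives another solution that is strictly longer.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 4))   # wide matrix: y = Ax has infinitely many solutions
y = rng.standard_normal(2)

# minimum-norm solution of y = Ax via the pseudoinverse
x0 = np.linalg.pinv(A) @ y
assert np.allclose(A @ x0, y)     # x0 solves y = Ax exactly

# a basis for the nullspace of A, from the SVD (rank(A) = 2 here)
_, _, Vt = np.linalg.svd(A)
N = Vt[2:].T

# any other solution x0 + N c also satisfies y = Ax, but is longer,
# because x0 lies in the row space of A, orthogonal to the nullspace
x1 = x0 + N @ np.array([1.0, -2.0])
assert np.allclose(A @ x1, y)
assert np.linalg.norm(x1) > np.linalg.norm(x0)
```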


Great question! I wouldn't say there is no relation; the two problems are quite closely related. The problem $$ \min_x \|Ax - y\| $$ is called the linear least-squares problem. Typically, $A$ has more rows than columns, in which case the problem is called overdetermined. This corresponds to linear regression, where you take more measurements than you have unknowns in your linear model. Most of the time it doesn't make sense for $A$ to have more columns than rows, because then $A$ has a nontrivial nullspace, and if $y$ lies in the range of $A$, there are infinitely many solutions, all with objective value zero. It does happen though, and is mostly interesting when $y$ is not in the range of $A$.

You can square the norm in the objective without changing the solutions. The optimality conditions of the problem can be written as the normal equations $$ A^T A x = A^T y $$ or as the augmented system $$ \begin{bmatrix} I & A \\ A^T & 0 \end{bmatrix} \begin{bmatrix} r \\ x \end{bmatrix} = \begin{bmatrix} y \\ 0 \end{bmatrix}, $$ where $r := y - Ax$ is the least-squares residual.
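A quick numerical check of these two formulations (the example data and variable names are mine): solving the normal equations and solving the augmented system produce the same $x$, and the recovered $r$ is indeed $y - Ax$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))   # tall: overdetermined least squares
y = rng.standard_normal(5)
m, n = A.shape

# normal equations: A^T A x = A^T y
x = np.linalg.solve(A.T @ A, A.T @ y)

# augmented system: [[I, A], [A^T, 0]] [r; x] = [y; 0]
K = np.block([[np.eye(m), A], [A.T, np.zeros((n, n))]])
rx = np.linalg.solve(K, np.concatenate([y, np.zeros(n)]))
r, x2 = rx[:m], rx[m:]

assert np.allclose(x, x2)          # both formulations give the same solution
assert np.allclose(r, y - A @ x)   # r is the least-squares residual
```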

When $A$ has more columns than rows, however, the interesting problem is the second one: find the $x$ with least norm that lies in the affine subspace $Ax=y$: $$ \min_x \|x\| \quad \text{subject to} \ Ax=y. $$ Here you must assume that $y$ is in the range of $A$, or this problem has no solution. The optimality conditions of the minimum-norm problem are $$ x - A^T z = 0, \quad Ax=y, $$ where $z$ is a vector of Lagrange multipliers. They can be rewritten as $A A^T z = y$ or $$ \begin{bmatrix} I & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x \\ -z \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, $$ and they are quite similar to those of the least-squares problem.
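The same kind of sketch for the minimum-norm problem (arbitrary example data), assuming $A$ has full row rank so that $AA^T$ is invertible: solving $AA^Tz = y$ and setting $x = A^Tz$ recovers the minimum-norm solution.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))   # wide, full row rank
y = rng.standard_normal(3)

# optimality conditions: x = A^T z with A A^T z = y
z = np.linalg.solve(A @ A.T, y)
x = A.T @ z

assert np.allclose(A @ x, y)                   # x is feasible
assert np.allclose(x, np.linalg.pinv(A) @ y)   # and it is the minimum-norm solution
```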

Evidently, when $A$ is square and nonsingular, both problems reduce to solving $Ax=y$ and have the same unique solution.

The real connection appears when the problem is regularized, which you do when $A$ is rank-deficient or ill-conditioned. The regularized linear least-squares problem is $$ \min_x \|Ax-y\|^2 + \delta^2 \|x\|^2, $$ and the regularized least-norm problem is $$ \min_{x,r} \|x\|^2 + \delta^2 \|r\|^2 \quad \text{subject to} \ Ax + \delta^2 r = y. $$ Now what's interesting is that those two problems are perfectly equivalent: they have the same optimality conditions, hence the same solution $x$, as you can verify. The first corresponds to the original least-squares problem with the substitutions $$ A \leftarrow \begin{bmatrix} A \\ \delta I \end{bmatrix}, \quad y \leftarrow \begin{bmatrix} y \\ 0 \end{bmatrix}, $$ and the second to the original least-norm problem with $A \leftarrow \begin{bmatrix} A & \delta I \end{bmatrix}$ and the scaled residual appended to the unknowns. So those problems are quite tightly connected.
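A small sketch to check the claimed equivalence numerically (the data and $\delta$ are arbitrary): eliminating $r$ from the regularized least-norm conditions gives $x = A^T r$ with $(AA^T + \delta^2 I)r = y$, which should coincide with the regularized least-squares solution $(A^TA + \delta^2 I)^{-1}A^Ty$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 6))
y = rng.standard_normal(4)
delta = 0.1
m, n = A.shape

# regularized least squares: (A^T A + delta^2 I) x = A^T y
x_ls = np.linalg.solve(A.T @ A + delta**2 * np.eye(n), A.T @ y)

# regularized least norm: x = A^T r with (A A^T + delta^2 I) r = y
r = np.linalg.solve(A @ A.T + delta**2 * np.eye(m), y)
x_ln = A.T @ r

assert np.allclose(x_ls, x_ln)   # the two regularized problems agree
```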