Given $\min\limits_x \|x\|_2^2 \quad \text{s.t.} \quad Ax = b$ show $x^* = A^T(AA^T)^{-1}b$

60 Views Asked by At

Given $$\min\limits_x \|x\|_2^2$$ $$\text{s.t.} Ax = b$$ show $x^* = A^T(AA^T)^{-1}b$ where $A \in \mathbb{R}^{m \times n}, m < n$

This is projection $x$ onto the hyperplane $Ax - b = 0$

Multiplying $A^T$ on both sides

$A^TAx - A^Tb = 0$

So we have $\langle A^TAx - A^Tb, x \rangle = 0$

$(A^TAx - A^Tb)^Tx = 0$

$(x^TA^TA - b^TA)x = 0$

$x^TA^TAx = b^TAx = x^TA^Tb$

So $A^TAx = A^Tb$ (circular)

$x = (A^TA)^{-1}A^Tb$ (Wrong)

How can you do this correctly?

2

There are 2 best solutions below

3
On BEST ANSWER

You need the hypothesis that $A$ has rank $m$ so that $AA^T$ is invertible.

Since $x^*=A^T(AA^T)^{-1}b$ we have $Ax^*=b$, so for every $x$ such that $Ax=b$ we have $x-x^*\in {\rm ker}A$ and clearly $x^*\in {\rm Im}A^T$, but it is well-known that ${\rm Im}A^T=({\rm ker}A)^\bot$. Thus, by Pythagoras theorem $$\Vert x\Vert^2=\Vert x-x^*\Vert^2+\Vert x^*\Vert^2\ge \Vert x^*\Vert^2$$ with equality if and only if $x=x^*$.

0
On

It's not clear to me under what principle we could have expected "multiplying $A^T$ on both sides" to lead to a proper answer. For one thing, since $m<n$, $A^TA$ is not invertible. (As it is, the correct answer assumes that $A$ has full row rank so that $AA^T$ is invertible; this is of course not always true.)

The right way to do this is to use a Lagrange multiplier. The Lagrangian is $$L(x,\lambda) = \|x\|_2^2 - \lambda^T ( A x - b )$$ where $\lambda\in\mathbb{R}^m$ is our Lagrange multiplier. Differentiating with respect to $x$ leads to the optimality condition $$2x - A^T \lambda = 0 \quad\Longrightarrow\quad x = \tfrac{1}{2}A^T \lambda$$ Now this value of $x$ must satisfy $Ax=b$, so $$Ax=A \cdot \left(\tfrac{1}{2}A^T\lambda\right) = \tfrac{1}{2}AA^T\lambda = b \quad\Longrightarrow \lambda = 2(AA^T)^{-1}b$$ We can do this if $A$ has full row rank, so $AA^T$ will be invertible. Returning to our formula for $x$, $$x = \tfrac{1}{2}A^T \lambda = \tfrac{1}{2}A^T \cdot \left(2(AA^T)^{-1}b\right) = A^T(AA^T)^{-1}b.$$