Verify that $x = −(I − A^{\dagger}A)c$ optimises $ \frac{1}{2}\|x\|^2 + \langle c,x \rangle$ under constraints

97 Views Asked by At

Under the constraint that $Ax = 0$ and $A$ is full row rank, how would you get the solution to the optimisation problem $$\text{min } \frac{1}{2}\|x\|^2 + \langle c,x \rangle$$

My text says that $x = −(I − A^{\dagger}A)c$ is a unique solution but I am unsure as to how to verify it as I am unsure what a feasible direction is under this constraint.

$A^{\dagger}$ is the Moore Penrose inverse

3

There are 3 best solutions below

2
On

EDIT: Another way to think about this is as follows

Note that

$$\frac{1}{2}\|x-(-c)\|_2^2 = \frac{1}{2}\|x\|^2 + \frac{1}{2}\|c\|^2 + c^Tx.$$

Now we want $x$ to be in the Null space of $A$ and hence we look at the projection of $-c$ onto the null space of $A$.


Let $u$ be the projection of $c$ onto the null space of $A$. Let $v$ be such that $c = u+v$ and $v \in \mathcal{N}(A)^\perp$. Then we have that for any $x \in \mathcal{N}(A)$ we have that

$$c^Tx = u^Tx.$$

Thus our objective is equivalent to

$$\frac{1}{2}\|x\|^2 + u^Tx.$$

Suppose this was unconstrained, then taking the derivative and setting it to $0$, we get

$$ x +u = 0 \Rightarrow x = -u.$$

Now $-u \in \mathcal{N}(A)$, hence this is the constrained solution as well.

Finally note that $(I-A^\dagger A)$ is the projection matrix for $ \mathcal{N}(A)$. Thus,

$$-(I-A^\dagger A)c = -u.$$

0
On

Too long for a comment.

Making the Lagrangian

$$L(x,\lambda)= \frac{1}{2}\|x\|^2 + \langle c,x \rangle-\lambda^TA x$$

the stationary points are the solutions for

$$ x^T+c^T-\lambda^T A=0\Rightarrow x^T = -c^T + \lambda^T A $$

but as $\lambda$ is generic, there exists $B$ generic such that $\lambda^T = c^T B$ and then

$$ x^T = -c^T(I-B A) $$

or

$$ x = -(I-A^TB^T)c $$

so we know the solution structure and we can argue for the solution to

$$ \min_B \frac 12 c^T(I-B A)(I-A^TB^T)c-c^T(I-B A)c $$

0
On

Let $A$ be a $m\times n$ matrix.

Fact 1: If $A$ is full low rank, then $A^\dagger = A^\mathsf{T}(A A^\mathsf{T})^{-1}$. Also, $(I - A^\dagger A)^\mathsf{T}(I - A^\dagger A) = I - A^\dagger A$.

Fact 2: All solutions of $Ax = 0$ are given by $x = (I - A^\dagger A)y$ where $y \in \mathbb{R}^n$ is arbitrary.

Using Facts 1-2, equivalently, we turn to find $y\in \mathbb{R}^n$ such that $$f(y) = \frac12\|(I - A^\dagger A)y\|^2 + \langle c, (I - A^\dagger A)y \rangle$$ is minimized.

Using Fact 1, we have \begin{align*} f(y) &= \frac12 y^\mathsf{T}(I - A^\dagger A)^\mathsf{T}(I - A^\dagger A)y + c^\mathsf{T} (I - A^\dagger A)y \\ &= \frac12 y^\mathsf{T}(I - A^\dagger A)^\mathsf{T}(I - A^\dagger A) y + c^\mathsf{T} (I - A^\dagger A)^\mathsf{T}(I - A^\dagger A)y \\ &= \frac12\|(I - A^\dagger A)y + (I - A^\dagger A)c\|^2 - \frac12 c^\mathsf{T}(I - A^\dagger A)^\mathsf{T}(I - A^\dagger A) c. \end{align*} Thus, any global minimizer of $f(y)$ satisfies $(I - A^\dagger A)y + (I - A^\dagger A)c = 0$ or $(I - A^\dagger A)y = - (I - A^\dagger A)c$.
Thus, $x = - (I - A^\dagger A)c$ is the unique global minimizer of the original problem.