Solving Quadratic Optimization problems without using Lagrange Multipliers


I am having trouble understanding min/max problems in linear algebra; I am used to solving these types of problems with some form of calculus. The question I am stuck on is minimizing:

$$b^{T}x + \frac{1}{2}x^{T}x$$

subject to

$$Ax = 0$$

I think it's a fairly straightforward question when solved with gradients and Lagrange multipliers, but I am stuck on how to even start problems like this without them.

Accepted answer:

Find a basis for the null space of $A$. Then, if the matrix $N$ has the vectors of this basis as its columns, you can express $x$ as follows

$x = N u $

where $N$ is $n \times k$ and $u $ is a $k \times 1 $ vector. So now the optimization problem becomes unconstrained as follows. Find $u \in \mathbb{R}^k $ such that

$ f = \dfrac{1}{2} u^T N^T N u + b^T N u \hspace{15pt}(1) $

is minimized. And the solution of this problem is well-known. It can be found by completing the square. We want to absorb the linear term $b^T N u = u^T N^T b$, so write $f$ as follows

$ f = \dfrac{1}{2} (u - u_0)^T N^T N (u - u_0) - \dfrac{1}{2} u_0^T N^T N u_0 \hspace{15pt}(2) $

Expanding the form we just wrote, we get

$ f = \dfrac{1}{2} u^T N^T N u - u^T N^T N u_0 $

Comparing this to the original form (Eq. $(1)$), we must select $u_0$ such that

$ - u^T N^T N u_0 = u^T N^T b $

Therefore,

$ u_0 = - (N^T N)^{-1} N^T b $

From Eq. $(2)$, and since $N^T N$ is positive definite (the columns of $N$ are linearly independent), the minimum value of $f$ is

$ f_{min} = - \dfrac{1}{2} u_0^T (N^T N) u_0 = - \dfrac{1}{2} b^T N (N^T N)^{-1} N^T b $

The optimal value of vector $x$ is

$ x = N u_0 = - N (N ^T N )^{-1} N^T b $

Note that $f_{min} = \dfrac{1}{2} b^T x $, since $b^T x = - b^T N (N^T N)^{-1} N^T b = 2 f_{min}$.
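The construction above can be sketched numerically. This is a minimal illustration with made-up $A$ and $b$ (not from the question); `scipy.linalg.null_space` is used only as a convenient way to obtain a basis matrix $N$ of the null space:

```python
import numpy as np
from scipy.linalg import null_space

# Illustrative data: A is 2x4 with full row rank.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, -2.0, 0.5, 1.0])

N = null_space(A)                        # columns form a basis of null(A)
u0 = -np.linalg.solve(N.T @ N, N.T @ b)  # u0 = -(N^T N)^{-1} N^T b
x = N @ u0                               # minimizer x = N u0

f_min = b @ x + 0.5 * (x @ x)            # objective value at the minimizer
```

Since every $x = N u$ satisfies $A x = 0$ by construction, the problem in $u$ really is unconstrained, and no feasibility check on $u$ is needed.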


Another way to proceed is directly through Lagrange multipliers.

The Lagrangian is

$g (x, \lambda) = \frac{1}{2} x^T x + b^T x + \lambda^T (A x) $

The gradient with respect to $x$ is

$ \nabla_x g = x + b + A^T \lambda $

And the gradient with respect to $\lambda$ is

$\nabla_\lambda g = A x $

From $\nabla_x g = 0 $ , it follows that $ x = - b - A^T \lambda $

From $\nabla_\lambda g = 0 $ , it then follows that

$A (- b - A^T \lambda) = 0 $

Therefore,

$ A A^T \lambda = - A b $

At this point we have to assume that $A A^T$ is invertible (i.e. if $A$ is $m \times n$, we want $A$ to have full row rank, so that $A A^T$ has rank $m$).

Then,

$\lambda = - (A A^T)^{-1} A b = - (A^T)^{+} b $

where $(A^T)^{+} = (A A^T)^{-1} A$ is the Moore–Penrose pseudo-inverse of $A^T$.

And, therefore,

$ x = - b - A^T \lambda = - ( I - A^T (A A^T)^{-1} A ) b $

And finally the minimum value of $f$ is

$ f_{min} = - \frac{1}{2} b^T ( I - A^T (A A^T)^{-1} A ) b $

The advantage of this method is that we can express the minimizing $x$ and $f_{min} $ directly in terms of $A$ and $b$.
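These closed-form expressions can be checked on made-up illustrative data; the only assumption is that $A$ has full row rank so that $A A^T$ is invertible:

```python
import numpy as np

# Illustrative data: A is 2x4 with full row rank.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, -2.0, 0.5, 1.0])

I = np.eye(A.shape[1])
P = A.T @ np.linalg.solve(A @ A.T, A)    # A^T (A A^T)^{-1} A
lam = -np.linalg.solve(A @ A.T, A @ b)   # lambda = -(A A^T)^{-1} A b
x = -(I - P) @ b                         # minimizer x = -(I - P) b
f_min = -0.5 * b @ (I - P) @ b           # minimum value of f
```

Here `np.linalg.solve` is used rather than forming $(A A^T)^{-1}$ explicitly, which is the numerically preferable choice.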

Finally, one notes that if $A$ is $m \times n$ with $ m \lt n $ and rank $m$, then $N$ will be of dimension $n \times (n - m)$; therefore any vector $y \in \mathbb{R}^n$ can be written as

$ y = [ A^T : N ] [u_1 , u_2 ]^T $

where $u_1 \in \mathbb{R}^m , u_2 \in \mathbb{R}^{n-m} $

And we will have

$(I - A^T (A A^T)^{-1} A) y = [ 0 : N] [u_1 , u_2]^T $

and also,

$ (N (N^T N)^{-1} N^T ) y = [ 0 : N ] [u_1 , u_2]^T $

Thus both matrices are projectors onto the null space of $A$; since they agree on every $y \in \mathbb{R}^n$, they are equal.
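The equality of the two projectors can likewise be verified numerically; the $A$ below is the same kind of made-up full-row-rank example used above:

```python
import numpy as np
from scipy.linalg import null_space

# Illustrative data: A is 2x4 with full row rank.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])

N = null_space(A)
P1 = np.eye(A.shape[1]) - A.T @ np.linalg.solve(A @ A.T, A)  # I - A^T (A A^T)^{-1} A
P2 = N @ np.linalg.solve(N.T @ N, N.T)                       # N (N^T N)^{-1} N^T
```

Both matrices are idempotent and are annihilated by $A$ from the left, as expected of a projector onto the null space of $A$.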