Minimum quadratic form value within a line?


If I have $x\in \mathbb R^n$, $C\in \mathbb R^{m\times n}$, $d\in \mathbb R^m$ with $m<n$, then the solution set of $Cx=d$ is a linear manifold (an affine subspace).

Also let $P\in \mathbb R^{n\times n}$ with $P\succ 0$ (positive definite), and consider the quadratic form $y=x^TPx$.

Is there an analytical expression for the minimum value of $y$ subject to $Cx=d$?

For $n=3$, $m=2$, this is like finding the minimum distance from points on a line to the origin.

I want to use this as a constraint in convex optimization.


Apply the Lagrange multiplier method. The Lagrangian is $$ f(x)=x^TPx+\lambda^T(Cx-d). $$ Note that here $\lambda\in\mathbb R^m$ is a vector, since you have $m$ constraints. The minimum is attained where $$ \begin{cases} \partial f/\partial x=2Px+C^T\lambda=0,\\ \partial f/\partial\lambda=Cx-d=0. \end{cases} $$ Since $P\succ 0$ and therefore invertible, we have $$ x=-\frac{1}{2}P^{-1}C^T\lambda\Rightarrow Cx=-\frac{1}{2}(CP^{-1}C^T)\lambda=d. $$ Assuming $C$ has full row rank, $CP^{-1}C^T$ is invertible and $$ \lambda^*=-2(CP^{-1}C^T)^{-1}d. $$ The minimizer $x^*$ is then $$ x^*=-\frac{1}{2}P^{-1}C^T\lambda^*=P^{-1}C^T(CP^{-1}C^T)^{-1}d, $$ and the minimum value simplifies to $$ y^*=x^{*T}Px^*=d^T(CP^{-1}C^T)^{-1}d. $$
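As a quick numerical sanity check of this closed form (the matrices below are arbitrary example data, not from the question):

```python
import numpy as np

def min_quadratic_on_affine(P, C, d):
    """Minimize x^T P x subject to C x = d, for P positive definite
    and C with full row rank, via the Lagrange-multiplier closed form."""
    Pinv_Ct = np.linalg.solve(P, C.T)         # P^{-1} C^T
    M = C @ Pinv_Ct                           # C P^{-1} C^T
    x_star = Pinv_Ct @ np.linalg.solve(M, d)  # P^{-1} C^T (C P^{-1} C^T)^{-1} d
    y_star = d @ np.linalg.solve(M, d)        # d^T (C P^{-1} C^T)^{-1} d
    return x_star, y_star

# Example data (n = 3, m = 2): any SPD P and full-row-rank C will do.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
P = A @ A.T + 3.0 * np.eye(3)                 # symmetric positive definite
C = rng.standard_normal((2, 3))               # full row rank (generically)
d = rng.standard_normal(2)

x_star, y_star = min_quadratic_on_affine(P, C, d)
```

The constraint $Cx^*=d$ and the identity $y^*={x^*}^TPx^*$ can be checked directly on the output.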

I propose solving your problem iteratively using the primal-dual algorithm of Chambolle & Pock (more precisely, Algorithms 1 and 2 of their paper). This is also an opportunity to familiarize the reader with primal-dual and proximal algorithms, a contemporary toolbox that is apparently not yet as popular as it deserves to be.

Relaxed assumptions. You only need the weaker condition $P \succeq 0$. If $P \succ 0$, then a quadratic acceleration over the basic algorithm is guaranteed. Most importantly, you don't need any assumptions on $C$ whatsoever.

To begin with, observe that your problem can be conveniently written in the form \begin{eqnarray} \underset{x \in \mathbb{R}^n}{\text{minimize }}g(x) + f(Cx), \end{eqnarray} where $g: x \mapsto \langle x, Px\rangle$ and $f = i_{\{d\}}: u \mapsto \begin{cases}0, &\mbox{ if }u = d,\\+\infty, &\mbox{ otherwise.}\end{cases}$

If you are a physicist, think of $f$ as an infinite potential well concentrated at $d$. Note that both $g$ and $f$ are proper convex functions. The above problem is the primal form of the following saddle-point (equilibrium) problem \begin{eqnarray} \underset{x \in \mathbb{R}^n}{\text{minimize }}\underset{u \in \mathbb{R}^m}{\text{maximize }}\langle Cx, u\rangle + g(x) - f^*(u), \end{eqnarray} where $h^*:u \mapsto \underset{z \in \mathbb{R}^m}{\text{sup }}\langle z, u\rangle - h(z)$ is the Fenchel-Legendre transform (aka convex conjugate) of a proper convex lower semi-continuous function $h \in \Gamma_0(\mathbb{R}^m)$. In your case, an easy computation reveals $f^*(u) \equiv \langle d, u\rangle$.

You'll need the following ingredients to implement the algorithms of the reference paper.

Proximal operator of $f^*$. Given a scalar $\sigma > 0$, and a point $u \in \mathbb{R}^m$, one computes \begin{eqnarray} prox_{\sigma f^*}(u) := \underset{z \in \mathbb{R}^m}{\text{argmin }}\frac{1}{2}\|z-u\|^2 + \sigma f^*(z) = \underset{z \in \mathbb{R}^m}{\text{argmin }}\frac{1}{2}\|z-u\|^2 + \sigma \langle d, z\rangle = u - \sigma d. \end{eqnarray}
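A minimal numerical sketch of this prox (example data assumed); strict convexity of the objective makes the shift formula easy to verify by comparing objective values:

```python
import numpy as np

def prox_sigma_fstar(u, d, sigma):
    # argmin_z 0.5*||z - u||^2 + sigma*<d, z>; setting the gradient
    # z - u + sigma*d to zero gives the closed-form shift below.
    return u - sigma * d

# Arbitrary example data.
rng = np.random.default_rng(1)
u = rng.standard_normal(4)
d = rng.standard_normal(4)
sigma = 0.7
z_star = prox_sigma_fstar(u, d, sigma)
```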

Proximal operator of $g$. Given a scalar $\tau > 0$ and a point $x \in \mathbb{R}^n$, one computes \begin{eqnarray} \begin{split} prox_{\tau g}(x) &:= \underset{z \in \mathbb{R}^n}{\text{argmin }}\frac{1}{2}\|z-x\|^2 + \tau g(z) = \underset{z \in \mathbb{R}^n}{\text{argmin }}\frac{1}{2}\|z-x\|^2 + \tau \langle z, Pz\rangle\\ &= (I + 2\tau P)^{-1}x, \end{split} \end{eqnarray} since the gradient of the objective, $z - x + 2\tau Pz$, vanishes at the minimizer.

Remarkably, the above computation can be carried out using a cached eigendecomposition of $P$, thus avoiding a matrix inversion at every iteration. Indeed, since $P \succeq 0$ is symmetric, let $P = VSV^*$ be its eigendecomposition, where $V$ is an $n$-by-$n$ unitary matrix and $S$ is an $n$-by-$n$ diagonal matrix with nonnegative decreasing diagonal entries $s_1 \ge s_2 \ge \ldots \ge s_r > 0 = s_{r+1} = \ldots = s_n$ (for $P \succeq 0$ this coincides with the SVD). Then \begin{eqnarray} (I + 2\tau P)^{-1}x = (I + 2\tau VSV^*)^{-1}x = \left(V(I + 2\tau S)V^*\right)^{-1}x = VDV^*x, \end{eqnarray}

where $D := diag\left(\frac{1}{1 + 2\tau s_1}, \frac{1}{1 + 2\tau s_2}, \ldots, \frac{1}{1 + 2\tau s_r}, 1, 1, \ldots, 1\right) \in \mathbb{R}^{n \times n}$.

BTW, note that $r = rank(P)$.
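Here is a short sketch of $prox_{\tau g}$ (example data assumed), computed directly from its defining minimization by zeroing the gradient $z - x + 2\tau Pz$, and equivalently via a cached spectral factorization of $P$:

```python
import numpy as np

def prox_tau_g(x, P, tau):
    # argmin_z 0.5*||z - x||^2 + tau*<z, Pz>; the gradient
    # z - x + 2*tau*P z vanishes at z = (I + 2*tau*P)^{-1} x.
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) + 2.0 * tau * P, x)

def prox_tau_g_cached(x, s, V, tau):
    # Same prox using a cached eigendecomposition P = V diag(s) V^T,
    # so each call costs two matrix-vector products and a diagonal scaling.
    return V @ ((V.T @ x) / (1.0 + 2.0 * tau * s))

# Arbitrary example data: a symmetric positive semidefinite P.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
P = A @ A.T
s, V = np.linalg.eigh(P)   # eigendecomposition, computed once and cached
x = rng.standard_normal(4)
tau = 0.3
```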

Square of Lipschitz constant $L$. We have \begin{eqnarray} L^2 := \|C\|^2 = \lambda_{max}(C^*C) = \lambda_{max}(CC^*), \end{eqnarray} which can be computed via the power iteration.
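A power-iteration sketch for $\|C\|^2$ might look as follows (the fixed iteration count is a crude choice, assumed sufficient for this small example):

```python
import numpy as np

def squared_opnorm(C, n_iter=1000):
    # Power iteration on M = C^T C; the Rayleigh quotient of the
    # normalized iterate converges to lambda_max(C^T C) = ||C||^2.
    M = C.T @ C
    v = np.ones(M.shape[0])
    for _ in range(n_iter):
        v = M @ v
        v /= np.linalg.norm(v)
    return v @ (M @ v)

# Arbitrary example data.
rng = np.random.default_rng(3)
C = rng.standard_normal((2, 3))
L2 = squared_opnorm(C)
```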

Modulus of strong convexity of $g$. Notice that if $P \succ 0$, then $g$ is strongly convex, with modulus of convexity $\gamma = \lambda_{min}(\nabla^2 g) = \lambda_{min}(2P) = 2\lambda_{min}(P) = 2s_n > 0$, i.e. twice the smallest eigenvalue of $P$.

The algorithm and its rate of convergence. Now apply Algorithm 2 of the reference paper if $P \succ 0$, and Algorithm 1 otherwise, with the ingredients above. According to the paper's theorems, the resulting algorithm reaches a primal-dual gap below a tolerance $\zeta$ in $\mathcal{O}(1/\sqrt{\zeta})$ iterations in the former case and $\mathcal{O}(1/\zeta)$ iterations in the latter.
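For concreteness, here is a sketch of the non-accelerated variant (Algorithm 1 with extrapolation parameter $\theta = 1$) specialized to this problem. The step sizes, iteration count, and example data are assumptions; the dual step is the shift $u \mapsto u - \sigma d$ computed earlier, and the primal step zeroes the gradient of the prox objective for $g$:

```python
import numpy as np

def chambolle_pock(P, C, d, n_iter=20000):
    # Primal-dual iteration for min_x <x, Px> s.t. Cx = d.
    # Step sizes chosen so that tau * sigma * ||C||^2 < 1.
    n, m = P.shape[0], C.shape[0]
    Lnorm = np.linalg.norm(C, 2)
    tau = sigma = 0.95 / Lnorm
    x = np.zeros(n)
    x_bar = x.copy()
    u = np.zeros(m)
    R = np.eye(n) + 2.0 * tau * P                        # resolvent for prox_{tau g}
    for _ in range(n_iter):
        u = u + sigma * (C @ x_bar) - sigma * d          # prox_{sigma f*}
        x_new = np.linalg.solve(R, x - tau * (C.T @ u))  # prox_{tau g}
        x_bar = 2.0 * x_new - x                          # extrapolation, theta = 1
        x = x_new
    return x

# Example data (assumed): small SPD P and full-row-rank C.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
P = A @ A.T + 3.0 * np.eye(3)
C = rng.standard_normal((2, 3))
d = rng.standard_normal(2)
x_cp = chambolle_pock(P, C, d)
```

The output can be compared against the closed form $x^* = P^{-1}C^T(CP^{-1}C^T)^{-1}d$ from the other answer.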

Further reading. For the kinds of problems you seem interested in (signal recovery, compressed sensing, etc.), you may want to read the following paper by P.L. Combettes.

I hope this was all useful. Let me know if you need more (implementation) details.

Cheers!