Eigenvalue bound for quadratic maximization with linear constraint


This builds on my earlier questions here and here.


Let $B$ be a symmetric positive definite matrix in $\mathbb{R}^{k\times k}$ and consider the problem

$$\begin{array}{ll} \text{maximize} & x^\top B x\\ \text{subject to} & \|x\|=1 \\ & b^\top x = a\end{array}$$

where $b$ is an arbitrary unit vector and $a > 0$ is small (in particular $a < 1$, so that the feasible set is nonempty). Let $$\lambda_1 > \lambda_2 \geq \cdots \geq \lambda_k > 0$$ be the eigenvalues of $B$ with corresponding eigenvectors $z_1,...,z_k$. I conjecture that the optimal value of the problem is bounded below by $a^2 \lambda_1 + \left(1-a^2\right)\lambda_2$, at least if $a$ is small enough.


To motivate this conjecture, let us consider two special cases. First, suppose that $a= 0$. Then, as was explained to me in one of my previous posts, the optimal value is between $\lambda_1$ and $\lambda_2$ by the Courant-Fischer theorem. Thus, $\lambda_2$ is a lower bound, and it also coincides with my conjectured lower bound in this special case.

Second, let $a > 0$ but suppose that $b = z_i$ for some $i = 1,...,k$. Any feasible $x$ can be written as

$$x = ab + \sqrt{1-a^2} \cdot \hat{b}$$

where $\hat{b}$ is a unit vector with $\hat{b}\perp b$. If $b = z_1$, I can take $\hat{b} = z_2$, which gives the objective value $a^2\lambda_1 + \left(1-a^2\right)\lambda_2$ exactly. If $b = z_i$ for some $i \neq 1$, I can take $\hat{b} = z_1$, which gives $a^2\lambda_i + \left(1-a^2\right)\lambda_1$; this is at least $a^2\lambda_1 + \left(1-a^2\right)\lambda_2$ precisely when $\left(1-a^2\right)\left(\lambda_1-\lambda_2\right) \geq a^2\left(\lambda_1-\lambda_i\right)$, which holds whenever $a$ is small enough (note that this step requires $\lambda_1 > \lambda_2$). Either way, the objective value of $x$ is bounded below by $a^2 \lambda_1 + \left(1-a^2\right)\lambda_2$ as long as $a$ is small enough.
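For concreteness, here is a small numerical check of this special case (a minimal sketch with a hypothetical $4\times 4$ instance; the matrix, the choice $b = z_3$, and the value of $a$ are my own, not part of the question):

```python
import numpy as np

# Hypothetical symmetric positive definite B with distinct eigenvalues.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrix
eigvals = np.array([5.0, 3.0, 2.0, 1.0])           # lambda_1 > lambda_2 >= ... > 0
B = Q @ np.diag(eigvals) @ Q.T

a = 0.1
z = Q                      # columns are the eigenvectors z_1, ..., z_k
b = z[:, 2]                # special case: b equals the eigenvector z_3 (i != 1)

# Feasible point from the construction x = a*b + sqrt(1-a^2)*b_hat with b_hat = z_1.
x = a * b + np.sqrt(1 - a**2) * z[:, 0]

value = x @ B @ x
bound = a**2 * eigvals[0] + (1 - a**2) * eigvals[1]

print("||x||              :", np.linalg.norm(x))    # should be 1
print("b^T x              :", b @ x)                # should be a
print("objective          :", value)                # a^2*lambda_3 + (1-a^2)*lambda_1
print("conjectured bound  :", bound)                # a^2*lambda_1 + (1-a^2)*lambda_2
print("objective >= bound :", value >= bound)
```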

The difficulty is showing that the conjectured bound holds when $b$ is not one of the eigenvectors of $B$ (perhaps with additional restrictions on how large $a$ can be). My intuition is that, if $x$ is not required to be orthogonal to $b$ but only "almost" orthogonal (meaning that $a$ may need to be sufficiently small), then one should be able to move a bit further in the direction of the principal eigenvector than in the case $a = 0$.


Here is the most up-to-date work on this problem. In the answer below, it was found that the optimal value $v$ of the problem is a generalized eigenvalue of the system

$$PBx = vPx,$$

which in turn was derived from the system

$$PBPy + aPBb = v Py.$$

Any pair $\left(y,v\right)$ that solves these equations then leads to a feasible $x = ab+Py$, with $v$ being the objective value.

We can write

$$\left(vI - PB\right)Py = aPBb.$$

Note that, for any $v$ that is not an eigenvalue of $PB$, the matrix $vI-PB$ is invertible, whence

$$Py = a\left(vI-PB\right)^{-1}PBb.$$

The normalization $x^\top x = 1$ then becomes $y^\top P y = 1-a^2$, leading to the equation

$$\frac{1-a^2}{a^2} = b^\top BP\left(vI-PB\right)^{-2} PBb.$$

The largest root of this equation is the optimal value of the problem. Perhaps, as suggested, it can be found numerically.
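Here is a rough sketch of that numerical route in Python/NumPy (the random test instance, the use of `scipy.optimize.brentq`, and the bracketing of the root just above the largest eigenvalue of $PBP$ are my own choices; the bracketing assumes the generic case in which $PBb$ has a nonzero component along the top eigenvector of $PBP$, so that the left-hand side blows up at that eigenvalue):

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
k, a = 5, 0.1

# Hypothetical test instance: symmetric positive definite B and a unit vector b.
M = rng.standard_normal((k, k))
B = M @ M.T + k * np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)

I = np.eye(k)
P = I - np.outer(b, b)        # orthogonal projector onto the nullspace of b
c = P @ B @ b                 # the vector PBb appearing above
target = (1 - a**2) / a**2

def g(v):
    # g(v) = b^T B P (vI - PB)^{-2} P B b = c^T (vI - PB)^{-2} c  (B and P symmetric)
    w = np.linalg.solve(v * I - P @ B, c)
    w = np.linalg.solve(v * I - P @ B, w)
    return c @ w

# g blows up as v approaches the largest eigenvalue of PBP from above and decays
# to 0 as v -> infinity, so (generically) the largest root of g(v) = (1-a^2)/a^2
# lies in that interval.
mu_max = np.linalg.eigvalsh(P @ B @ P).max()
lo = mu_max + 1e-9
hi = mu_max + 1.0
while g(hi) > target:
    hi = mu_max + 2 * (hi - mu_max)
v = brentq(lambda t: g(t) - target, lo, hi)

# Reconstruct the candidate point x = a*b + Py from Py = a (vI - PB)^{-1} PBb.
Py = a * np.linalg.solve(v * I - P @ B, c)
x = a * b + Py

lam = np.linalg.eigvalsh(B)[::-1]    # lambda_1 >= lambda_2 >= ...
print("root v            :", v)
print("||x||, b^T x      :", np.linalg.norm(x), b @ x)   # feasibility check
print("objective x^T B x :", x @ B @ x)
print("conjectured bound :", a**2 * lam[0] + (1 - a**2) * lam[1])
```

The reconstructed $x$ is feasible by construction (the displayed equation is exactly the normalization $y^\top P y = 1 - a^2$, and $Py \perp b$), so its objective value can be compared directly with the conjectured bound.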


There are 2 answers below.

BEST ANSWER

The following analysis explores various approaches to the problem, but ultimately fails to produce a satisfactory solution.

One of the constraints can be rewritten using the nullspace projector of $b$
$$\eqalign{
P &= \Big(I-(b^T)^+b^T\Big) = \left(I-\frac{bb^T}{b^Tb}\right) \;=\; I-\beta bb^T \\
Pb &= 0,\qquad P^2=P=P^T \\
}$$
and the introduction of an unconstrained vector $y$
$$\eqalign{
b^Tx &= a \\
x &= Py + (b^T)^+a \\
  &= Py + a\beta b \\
  &= Py + \alpha_0 b \\
}$$
The remaining constraint can be absorbed into the definition of the objective function itself
$$\eqalign{
\lambda &= \frac{x^TBx}{x^Tx}
\;=\; \frac{y^TPBPy +2\alpha_0y^TPBb +\alpha_0^2\,b^TBb}{y^TPy +\alpha_0^2\,b^Tb}
\;=\; \frac{\theta_1}{\theta_2} \tag{0} \\
}$$
The gradient can be calculated by a straightforward (if tedious) application of the quotient rule as
$$\eqalign{
\frac{\partial\lambda}{\partial y} &= \frac{2\theta_2(PBPy +\alpha_0PBb)-2\theta_1Py}{\theta_2^2} \\
}$$
Setting the gradient to zero yields
$$PBPy +\alpha_0PBb = \lambda Py \tag{1}$$
which can be rearranged into a generalized eigenvalue equation
$$\eqalign{
PB\left(Py+\alpha_0b\right) &= \lambda Py \\
PBx &= \lambda Px \tag{2} \\
}$$
Note that multiplying the standard eigenvalue equation
$$Bx = \lambda x \tag{3}$$
by $P$ reproduces equation $(2)$. So both standard and generalized eigenvalues are potential solutions.
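As a sanity check on the quotient-rule computation, here is a small numerical sketch (my own; the random instance and dimensions are hypothetical) comparing the closed-form gradient above with a central-difference approximation:

```python
import numpy as np

rng = np.random.default_rng(2)
k, a = 4, 0.2

# Hypothetical instance: symmetric positive definite B and a unit b, so beta = 1 and alpha_0 = a.
M = rng.standard_normal((k, k))
B = M @ M.T + np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)
P = np.eye(k) - np.outer(b, b)
alpha0 = a

def lam(y):
    # Equation (0): the quotient theta_1 / theta_2, written via x = Py + alpha_0 b.
    x = P @ y + alpha0 * b
    return (x @ B @ x) / (x @ x)

def grad(y):
    # Closed-form gradient from the quotient rule above.
    th1 = y @ P @ B @ P @ y + 2 * alpha0 * (y @ P @ B @ b) + alpha0**2 * (b @ B @ b)
    th2 = y @ P @ y + alpha0**2 * (b @ b)
    return (2 * th2 * (P @ B @ P @ y + alpha0 * P @ B @ b) - 2 * th1 * P @ y) / th2**2

y = rng.standard_normal(k)
eps = 1e-6
numeric = np.array([(lam(y + eps * e) - lam(y - eps * e)) / (2 * eps) for e in np.eye(k)])
print("max |analytic - numeric| :", np.abs(grad(y) - numeric).max())
```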

Unlike the discrete $\lambda$ values yielded by the eigenvalue methods, equation $(1)$ is solvable for a continuous range of $\lambda$
$$y = \alpha_0(\lambda P-PBP)^+PBb$$
and produces a $y$ vector which satisfies the zero-gradient condition $(1)$.
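A quick numerical illustration of this pseudoinverse formula (again my own sketch, with a hypothetical random instance): for several arbitrary values of $\lambda$, the returned $y$ does satisfy $(1)$.

```python
import numpy as np

rng = np.random.default_rng(3)
k, a = 4, 0.2

# Hypothetical instance as before: b is a unit vector, so alpha_0 = a.
M = rng.standard_normal((k, k))
B = M @ M.T + np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)
P = np.eye(k) - np.outer(b, b)
alpha0 = a

for lam in (0.5, 1.7, 4.3):   # arbitrary lambda values; a whole continuum works
    y = alpha0 * np.linalg.pinv(lam * P - P @ B @ P) @ (P @ B @ b)
    residual = P @ B @ P @ y + alpha0 * P @ B @ b - lam * P @ y   # equation (1)
    x = P @ y + alpha0 * b
    print(f"lambda = {lam}: residual = {np.linalg.norm(residual):.2e}, ||x|| = {np.linalg.norm(x):.3f}")
```

The printed $\|x\|$ values also show that these solutions do not, in general, satisfy the unit-norm constraint, which is the failure mode described next.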

Unfortunately, none of these approaches yields a solution which satisfies all of the constraints.

But solving equation $(0)$ for an optimal $y$ vector is still the appropriate goal, and requires a numerical rather than an analytical approach.
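A minimal sketch of that numerical route (the optimizer choice, starting point, and test instance are mine, not part of the answer): maximize the quotient $(0)$ over $y$ with a general-purpose method and report the resulting value together with the constraint residuals of $x = Py + \alpha_0 b$.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
k, a = 4, 0.2

# Hypothetical instance; b is a unit vector, so alpha_0 = a.
M = rng.standard_normal((k, k))
B = M @ M.T + np.eye(k)
b = rng.standard_normal(k)
b /= np.linalg.norm(b)
P = np.eye(k) - np.outer(b, b)
alpha0 = a

def neg_lam(y):
    # Negative of the quotient (0), so that minimizing it maximizes lambda.
    x = P @ y + alpha0 * b
    return -(x @ B @ x) / (x @ x)

res = minimize(neg_lam, rng.standard_normal(k), method="BFGS")
x = P @ res.x + alpha0 * b

print("maximized lambda        :", -res.fun)
print("largest eigenvalue of B :", np.linalg.eigvalsh(B).max())
print("||x||                   :", np.linalg.norm(x))              # not forced to equal 1
print("b^T x / ||x||           :", b @ x / np.linalg.norm(x))       # not forced to equal a
```

The last two printed quantities make it easy to see how far the unconstrained maximizer of $(0)$ is from satisfying the original normalization.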

ANSWER

I don't think the conjecture is correct. For a counterexample, take $B=\begin{pmatrix} 1&0&0 \\0&1&0 \\ 0&0&\varepsilon \end{pmatrix}$ with $0 < \varepsilon < 1$ and $b=\begin{pmatrix}0\\ 0\\ 1\end{pmatrix}$. The constraints force $x_3 = a$ and $x_1^2 + x_2^2 = 1-a^2$, so the desired maximum is $(1-a^2)+a^2\varepsilon < 1 = a^2\lambda_1 + (1-a^2)\lambda_2$. (Strictly speaking this $B$ has $\lambda_1 = \lambda_2$, which the question excludes, but replacing the middle diagonal entry by $1-\delta$ with $\delta > 0$ small enough gives the same conclusion with $\lambda_1 > \lambda_2$.)