Optimization+Geometry Question (hyperplane, ellipsoid, maximal norm)


Let $\boldsymbol{x} \in \mathbb{R}^n$, and let $\boldsymbol{A}, \boldsymbol{B} \in \mathbb{R}^{n \times n}$ be real symmetric matrices. I am trying to find the value $d^*$ defined as: \begin{equation} d^*=\max_{\boldsymbol{x} \in \mathbb{R}^n} \| \boldsymbol{x} \| \quad s.t. \quad \boldsymbol{x}^T \boldsymbol{A} \boldsymbol{x} = 0 \,,\, \boldsymbol{x}^T \boldsymbol{B} \boldsymbol{x} = 1 \end{equation} Assumptions: the matrix $\boldsymbol{A}$ is positive semidefinite and the matrix $\boldsymbol{B}$ is positive definite. The matrix $\boldsymbol{A}$ has to be singular, so that $\boldsymbol{x}^T \boldsymbol{A} \boldsymbol{x}=0$ is feasible for non-zero $\boldsymbol{x}$.

Is it possible to characterize $d^*$ based on properties of $\boldsymbol{A}$ and $\boldsymbol{B}$? Ideally, I am interested in the general case, but if that is hopeless, we can further assume that $\boldsymbol{A}$ and $\boldsymbol{B}$ differ only in their eigenvalues, i.e. $\boldsymbol{A}=\boldsymbol{V} \boldsymbol{D}_1 \boldsymbol{V}^T$ and $\boldsymbol{B}=\boldsymbol{V} \boldsymbol{D}_2 \boldsymbol{V}^T$, where $\boldsymbol{V}$ is some rotation matrix shared by both.

Any help is greatly appreciated!

2 Answers

BEST ANSWER

A good trick whenever you have an optimization problem with two matrices whose eigenvalues play a role is to try to simultaneously diagonalize them to simplify the problem. When one of the matrices is PD and the other is PSD, you can always do that by combining a square-root (e.g. Cholesky) decomposition with an eigenvalue decomposition. Starting from your initial statement:
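To make the trick concrete, here is a minimal NumPy sketch of the co-diagonalization step. The matrices are random stand-ins I made up, and I use the symmetric square root of $B$ rather than the Cholesky factor; any square root with $B=R^TR$ serves for the change of variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2

# Random stand-ins: A is PSD with a k-dimensional nullspace, B is PD
M = rng.standard_normal((n, n - k))
A = M @ M.T
P = rng.standard_normal((n, n))
B = P @ P.T + n * np.eye(n)

# Symmetric square root R of B, so B = R^T R = R R
w, U = np.linalg.eigh(B)
R = U @ np.diag(np.sqrt(w)) @ U.T

# Transform A into the y coordinates and eigendecompose it
Rinv = np.linalg.inv(R)
At = Rinv.T @ A @ Rinv
sig, Q = np.linalg.eigh(At)   # note: eigenvectors in the *columns* of Q here

# Congruence preserves rank, so exactly k eigenvalues of the
# transformed A are (numerically) zero
assert np.sum(sig < 1e-10) == k
```

One small caveat: `np.linalg.eigh` returns eigenvectors as columns, whereas the derivation below writes $Q^T\Sigma Q$ with eigenvectors in the rows of $Q$.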

\begin{equation} d^*=\max_{\boldsymbol{x} \in \mathbb{R}^n} \| \boldsymbol{x} \| \quad s.t. \quad \boldsymbol{x}^T \boldsymbol{A} \boldsymbol{x} = 0 \,,\, \boldsymbol{x}^T \boldsymbol{B} \boldsymbol{x} = 1 \end{equation} Now let's take a square root of $B$, i.e. a matrix $R$ with $B=R^TR$; the Cholesky factor works, and so does any other square root, such as the symmetric $B^{1/2}$. Note that $R$ is full rank and invertible because $B$ is PD. As was suggested in the other answer, it is easier to work with squared norms, so we replace $\| \boldsymbol{x} \|$ by $x^Tx$. Finally we apply the change of variables $Rx = y$ and get: \begin{equation} (d^*)^2=\max_{\boldsymbol{y} \in \mathbb{R}^n} \; (R^{-1} y)^T(R^{-1} y) \quad s.t. \quad (R^{-1} y)^T \boldsymbol{A} (R^{-1} y) = 0 \,,\, y^T y = 1 \end{equation} Half the job is done: the $B$ constraint has become a unit-norm constraint. Next we eigendecompose the transformed $A$ as $R^{-T}AR^{-1} = Q^T\Sigma Q$. Applying a second change of variables $Qy = z$, and using the fact that $z^Tz = y^T y$ because $Q$ is orthogonal, we get: \begin{equation} (d^*)^2=\max_{\boldsymbol{z} \in \mathbb{R}^n} \; z^TQR^{-T}R^{-1}Q^Tz \quad s.t. \quad z^T \Sigma z = 0 \,,\, z^T z = 1 \end{equation} And now both constraint matrices are in diagonal form! Let's denote the objective matrix $QR^{-T}R^{-1}Q^T$ by $C$ for clarity. Since $A$ is rank deficient, say with a nullspace of dimension $k$, the first $k$ diagonal entries of $\Sigma$ are 0 (or the last $k$, depending on how you order the eigenvalues). Because $\Sigma$ is PSD, the constraint $z^T \Sigma z = 0$ can only be satisfied if $z$ is non-zero only on those $k$ entries, so the entries from $k+1$ to $n$ of $z$ are 0 and we can forget about them. The problem reduces to the first $k$ entries of $z$, labeled $\hat{z}$, and we get: \begin{equation} (d^*)^2=\max_{\boldsymbol{\hat{z}} \in \mathbb{R}^k} \; \hat{z}^T C_{1:k,1:k}\hat{z} \quad s.t. \quad \hat{z}^T \hat{z} = 1 \end{equation} And this is simply an eigenvalue problem on the upper-left $k$-by-$k$ corner of $C$: $(d^*)^2$ is the largest eigenvalue of that submatrix, and $d^*$ is its square root. If you are also interested in the optimal vector $x$, you can zero-pad $\hat{z}$ to get $z$ and then undo the two changes of variables. Otherwise, you only need $R$ and $Q$ and can compute $d^* = \sqrt{\lambda_{\max}(Q_{1:k,1:n} R^{-T}R^{-1} Q^T_{1:k,1:n})}$, where the rows of $Q_{1:k,1:n}$ span the nullspace of $R^{-T}AR^{-1}$. If you pick the symmetric square root $R=B^{1/2}$, then $R^{-T}R^{-1}=B^{-1}$ and this simplifies to $d^* = \sqrt{\lambda_{\max}(Q_{1:k,1:n} B^{-1} Q^T_{1:k,1:n})}$ (with the Cholesky factor one has $B^{-1}=R^{-1}R^{-T}$ instead, so that last simplification does not apply). Hope this helps!
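The whole recipe can be sketched end-to-end in NumPy. This is a hedged sketch under made-up example matrices; it uses the symmetric square root of $B$ (so that $R^{-T}R^{-1}=B^{-1}$) and cross-checks the resulting $d^*$ against randomly sampled feasible points:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2

# Random stand-ins: A PSD with rank n-k, B PD
M = rng.standard_normal((n, n - k))
A = M @ M.T
P = rng.standard_normal((n, n))
B = P @ P.T + n * np.eye(n)

# Inverse of the symmetric square root of B; here Rinv @ Rinv == B^{-1}
w, U = np.linalg.eigh(B)
Rinv = U @ np.diag(1.0 / np.sqrt(w)) @ U.T

# Eigendecompose the transformed A; the k smallest eigenvalues are ~0
At = Rinv @ A @ Rinv
sig, Q = np.linalg.eigh(At)      # ascending eigenvalues
Nz = Q[:, :k]                    # basis of the nullspace of At

# (d*)^2 is the largest eigenvalue of the k-by-k matrix Nz^T B^{-1} Nz
C = Nz.T @ Rinv @ Rinv @ Nz
d_star = np.sqrt(np.linalg.eigvalsh(C)[-1])

# Sanity check: sample feasible x = R^{-1} z with z in the nullspace,
# rescale so that x^T B x = 1; no sampled norm should exceed d_star
Z = Nz @ rng.standard_normal((k, 20000))
X = Rinv @ Z
X = X / np.sqrt(np.einsum('ji,jk,ki->i', X, B, X))
norms = np.linalg.norm(X, axis=0)
assert norms.max() <= d_star + 1e-9
```

The sampled points also satisfy $x^TAx \approx 0$ by construction, since each $z$ lies in the numerical nullspace of the transformed $A$.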


Here's an idea: I'm going to disregard the $A$ constraint. For a non-zero $\mathbf{x}$ it requires $A$ to be singular ($\det(A)=0$), and since $A$ is PSD it additionally forces $\mathbf{x}$ into the nullspace of $A$; I ignore that restriction below, so strictly speaking what follows is an upper bound on $d^*$.

So we are left with the second constraint. Since $B$ is real symmetric, we can write $U^TBU=D$ for some orthogonal matrix $U$ and diagonal matrix $D$. Moreover, since $B\succ 0$, all the diagonal elements of $D$ are positive, and they are the eigenvalues of $B$.

Now, since we want to maximize $\|\mathbf{x}\|$, which is non-negative, this is equivalent to maximizing $\|\mathbf{x}\|^2$. Let's make the change of variables $\mathbf{x}=U\mathbf{y}$, with the orthogonal matrix $U$ from above. Then the objective function to maximize is $\|U\mathbf{y}\|^2\overset{(*)}{=}\|\mathbf{y}\|^2=\sum_{i=1}^n y_i^2$, where (*) comes from the fact that $U$ is orthogonal. For the constraint we have $\mathbf{x}^TB\mathbf{x}=\mathbf{y}^TU^TBU\mathbf{y}=\mathbf{y}^TD\mathbf{y}=\sum_{i=1}^n d_iy_i^2$.

Now the problem takes the form $$\max_{y_i}\sum_{i=1}^n y_i^2 \\ s.t. \sum_{i=1}^n d_iy_i^2 = 1$$

To solve this problem we look for the minimum diagonal element $d_{\min}$. Let's assume index $j$ attains it. Then set $y_j=d_j^{-1/2}$ and $y_i=0,\; \forall i\neq j$, which satisfies the constraint and achieves the objective value $y_j^2=1/d_{\min}$.

Going back to the original problem, the optimal squared norm is $(d^*)^2 = 1/\lambda_{\min}(B)$, i.e. $d^*$ is the inverse square root of the minimum eigenvalue of $B$.
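A quick NumPy check of this reduction, with a random PD matrix as a made-up stand-in for $B$: the maximal squared norm on $\{\mathbf{x}:\mathbf{x}^TB\mathbf{x}=1\}$ is $1/\lambda_{\min}(B)$, attained at the corresponding eigenvector, so the maximal norm is $1/\sqrt{\lambda_{\min}(B)}$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Random stand-in for a positive definite B
P = rng.standard_normal((n, n))
B = P @ P.T + n * np.eye(n)

# Claimed optimum: inverse square root of the smallest eigenvalue of B
w, U = np.linalg.eigh(B)          # ascending eigenvalues
d_star = 1.0 / np.sqrt(w[0])

# The maximizer: the eigenvector of lambda_min, scaled so x^T B x = 1
x_opt = U[:, 0] / np.sqrt(w[0])
assert np.isclose(x_opt @ B @ x_opt, 1.0)
assert np.isclose(np.linalg.norm(x_opt), d_star)

# Random feasible points (rescaled so x^T B x = 1) never exceed d_star
X = rng.standard_normal((n, 10000))
X = X / np.sqrt(np.einsum('ji,jk,ki->i', X, B, X))
assert np.linalg.norm(X, axis=0).max() <= d_star + 1e-9
```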