Is this SDP analytically solvable?


I am studying the following semidefinite program (SDP):

$$\begin{array}{rl} \textrm{Given:} & W \in \mathbb{R}^{n \times n} \\ \textrm{Variables:} & X \in \mathbb{S}^{n},\ u \in \mathbb{R}^{n} \\ \textrm{Minimize:} & u_1 + u_2 + \ldots + u_n \\ \textrm{Subject to:} & \forall i \in [n]:\\ & \begin{bmatrix} X & e_i \\ e_i^\intercal & u_i \end{bmatrix} \succeq 0 \\ & (W^\intercal X W)_{ii} \leq 1 \end{array}$$

where $e_i$ is the $n$-dimensional column vector with 1 in the $i$-th position and 0 elsewhere.

As part of trying to understand it, I have been looking at the case where $W = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. Setting $X = \begin{bmatrix} a & b \\ b & c\end{bmatrix}$ turns the SDP into:

$$\begin{array}{rl} \textrm{Minimize:} & u_1 + u_2 \\ \textrm{Subject to:} \\ & \begin{bmatrix} a & b & 1 \\ b & c & 0 \\ 1 & 0 & u_1 \end{bmatrix} \succeq 0 \\ & \begin{bmatrix} a & b & 0 \\ b & c & 1 \\ 0 & 1 & u_2 \end{bmatrix} \succeq 0 \\ & a + 2b + c \leq 1 \\ & a \leq 1 \end{array}$$

Looking at the principal minors of the two matrices gives the following constraints (after removing several redundancies):

$$\begin{eqnarray} a, c, u_1, u_2 & \geq & 0 \\ ac - b^2 & \geq & 0 \\ a u_1 & \geq & 1 \\ c u_2 & \geq & 1 \\ (ac - b^2) u_1 & \geq & c \\ (ac - b^2) u_2 & \geq & a \\ \end{eqnarray}$$
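As a sanity check on these minor conditions, here is a small pure-Python feasibility test with a candidate point chosen by hand (the point $a=1$, $b=-\tfrac12$, $c=1$, $u_1=u_2=\tfrac43$ is only claimed feasible, not optimal):

```python
# Feasibility check for the 2x2 example via the principal-minor
# conditions above. The candidate point is chosen by hand.
def feasible(a, b, c, u1, u2, tol=1e-9):
    checks = [
        a >= -tol, c >= -tol, u1 >= -tol, u2 >= -tol,
        a * c - b * b >= -tol,
        a * u1 >= 1 - tol,
        c * u2 >= 1 - tol,
        (a * c - b * b) * u1 >= c - tol,
        (a * c - b * b) * u2 >= a - tol,
        a + 2 * b + c <= 1 + tol,   # (W^T X W)_11 <= 1
        a <= 1 + tol,               # (W^T X W)_22 <= 1
    ]
    return all(checks)

a, b, c = 1.0, -0.5, 1.0          # X = [[1, -1/2], [-1/2, 1]]
u1 = u2 = 4.0 / 3.0               # smallest u's the minors allow here
print(feasible(a, b, c, u1, u2))  # True
print(u1 + u2)                    # objective value 8/3
```

Note that a negative off-diagonal $b$ helps satisfy $a + 2b + c \leq 1$ while keeping $ac - b^2$ away from zero.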

I have tried to find the optimum via the KKT conditions, by constructing the Lagrangian:

$$\begin{eqnarray} \mathcal{L}(a, b, c, u_1, u_2; \lambda) & = & u_1 + u_2 - \lambda^\intercal \begin{bmatrix} a \\ c \\ u_1 \\ u_2 \\ ac - b^2 \\ a u_1 - 1 \\ c u_2 - 1 \\ (ac - b^2)u_1 - c \\ (ac - b^2)u_2 - a \\ 1 - a \\ 1 - (a + 2b + c) \end{bmatrix} \end{eqnarray}$$

and setting the derivatives to zero, but I couldn't see an obvious solution.

So, is there a way to analytically find the optimum? Is there a simplification or modification or duality trick that I've missed?


1 Answer


First, the $n$ constraints $$ \begin{bmatrix} X & e_i \\ e_i^\intercal & u_i \end{bmatrix} \succeq 0 $$ can each be equivalently replaced by the following three constraints, by the generalized Schur complement lemma: $$ \begin{align} X &\succeq 0 \tag{1} \\ (I - XX^\dagger)e_i &= 0 \tag{2} \\ u_i - e_i^\intercal X^\dagger e_i &\geq 0 \tag{3} \end{align} $$ where $X^\dagger$ is the pseudo-inverse of $X$.

Next, the $n$ constraints $(I - XX^\dagger)e_1 = 0, \dots, (I - XX^\dagger)e_n = 0$ can be combined into the single equivalent constraint $(I - XX^\dagger)I = 0$, i.e. $XX^\dagger = I$. Because $X^\dagger$ exists and is unique for every $X$, the equation $XX^\dagger = I$ says that $X^\dagger$ is a right-inverse of $X$; and because $X$ is square, $X^\dagger$ is then also its left-inverse, hence its inverse. So $X$ is invertible, and being positive semi-definite and invertible, it is necessarily positive-definite. This implication also goes in the opposite direction, so we can replace constraints $(1)$, $(2)$, and $(3)$ with the equivalent constraints $$ \begin{align} X &\succ 0 \tag{4} \\ u_i &\geq e_i^\intercal X^{-1} e_i \tag{5} \end{align} $$

We now turn to the $n$ constraints $(W^\intercal X W)_{11} \leq 1, \dots, (W^\intercal X W)_{nn} \leq 1$, each of which can be written as $e_i^\intercal W^\intercal X W e_i \leq 1$. At this point, we can proceed by choosing $X \succ 0$ such that $e_i^\intercal W^\intercal X W e_i \leq 1$ for each $i$. Then, by constraint $(5)$, the minimum value of $u_i$ is $e_i^\intercal X^{-1} e_i$, so for that choice of $X$ the value of your problem becomes $\sum_{i=1}^n e_i^\intercal X^{-1} e_i$. Unfortunately, I'm not aware of any way to simplify the problem further without additional constraints.
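The equivalence between the block constraint and $(5)$ can be spot-checked numerically for $n = 2$ without an SDP solver, using the fact that a symmetric $3 \times 3$ matrix is PSD iff all seven of its principal minors are nonnegative. The sample $X$ and the threshold $2/3 = e_1^\intercal X^{-1} e_1$ below are hand-picked for illustration:

```python
# Spot-check of the Schur-complement claim for an invertible X:
# [[X, e_1], [e_1^T, u_1]] >= 0  iff  u_1 >= e_1^T X^{-1} e_1.
def psd3(M, tol=1e-9):
    """PSD test for a symmetric 3x3 matrix via all principal minors."""
    minors = [M[i][i] for i in range(3)]                     # 1x1 minors
    minors += [M[i][i] * M[j][j] - M[i][j] ** 2              # 2x2 minors
               for i, j in ((0, 1), (0, 2), (1, 2))]
    minors.append(                                           # determinant
        M[0][0] * (M[1][1] * M[2][2] - M[1][2] ** 2)
        - M[0][1] * (M[0][1] * M[2][2] - M[1][2] * M[0][2])
        + M[0][2] * (M[0][1] * M[1][2] - M[1][1] * M[0][2]))
    return all(m >= -tol for m in minors)

def block(X, u1):
    """The block matrix [[X, e_1], [e_1^T, u_1]] for 2x2 X."""
    return [[X[0][0], X[0][1], 1.0],
            [X[0][1], X[1][1], 0.0],
            [1.0, 0.0, u1]]

X = [[2.0, 1.0], [1.0, 2.0]]      # positive-definite sample
threshold = 2.0 / 3.0             # e_1^T X^{-1} e_1, computed by hand
print(psd3(block(X, threshold)))        # True  (boundary case)
print(psd3(block(X, threshold - 0.1)))  # False (u_1 too small)
```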


If we add the constraint that $X = \varepsilon I$ for some $\varepsilon > 0$ (since $X \succ 0$), then the problem can be simplified further.

In this case, the constraint $e_i^TW^TXWe_i \leq 1$ becomes $\varepsilon \cdot e_i^TW^TWe_i \leq 1$. So fix any $W$ and let $c = \max\{e_i^TW^TWe_i \mid i \in \{1,\dots,n\}\}$. If $c = 0$ (which happens only when $W = 0$), choose any $\varepsilon > 0$; otherwise, choose any $\varepsilon$ with $0 < \varepsilon \leq \frac{1}{c}$. This choice ensures that the $n$ constraints $(W^\intercal X W)_{11} \leq 1, \dots, (W^\intercal X W)_{nn} \leq 1$ are satisfied for $X = \varepsilon I$.

Finally, we are left with the $n$ constraints $u_1 \geq e_1^T X^{-1} e_1, \dots, u_n \geq e_n^T X^{-1} e_n$ from $(5)$. Because $X = \varepsilon I$, we have $X^{-1} = \frac{1}{\varepsilon}I$, so each of these constraints becomes $u_i \geq \frac{1}{\varepsilon} e_i^T e_i$. Since $e_i^T e_i = 1$ for each $i$, each constraint reduces to $u_i \geq \frac{1}{\varepsilon}$.

Therefore, to solve this restricted problem, we choose $u_i = \frac{1}{\varepsilon}$ for each $i$, so that the minimum value is $\sum_{i=1}^n u_i = \frac{n}{\varepsilon}$. Because this value depends on $\varepsilon$, and we have some freedom in choosing $\varepsilon$, we can sharpen the strategy above: if $c = 0$, the value $\frac{n}{\varepsilon}$ can be made arbitrarily small by letting $\varepsilon \to \infty$; otherwise, take $\varepsilon = \frac{1}{c}$, which minimizes $\frac{1}{\varepsilon}$ and gives the value $nc$.

Because the choice of $\varepsilon$ depends on $W$, the choice of each $u_i$, and hence the minimal value of the restricted problem, also depends on $W$.
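The restricted construction can be put together in a few lines of pure Python. The value $n/\varepsilon$ with $\varepsilon = 1/c$ is exactly the construction above; returning $0$ when $c = 0$ reflects the $\varepsilon \to \infty$ limit (an infimum, not an attained value, which is an interpretive choice here):

```python
# Value of the restricted problem (X = eps * I) for a concrete W:
# eps = 1/c with c = max_i (W^T W)_ii, giving sum_i u_i = n / eps = n * c.
def restricted_value(W):
    n = len(W)
    WtW = [[sum(W[k][i] * W[k][j] for k in range(n))  # (W^T W)_ij
            for j in range(n)] for i in range(n)]
    c = max(WtW[i][i] for i in range(n))
    if c == 0:        # only when W = 0: eps can grow without bound
        return 0.0    # infimum in the eps -> infinity limit
    return n * c      # n / eps with eps = 1/c

W = [[1.0, 1.0], [1.0, 0.0]]   # the W from the question
print(restricted_value(W))     # 4.0
```

For the question's $2 \times 2$ example this gives $4$; since the restriction $X = \varepsilon I$ discards the off-diagonal freedom in $X$, this is only an upper bound on the true optimum.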