Largest eigenvalue as Semidefinite Program in standard form

817 Views Asked by At

I'm trying to understand how finding the largest eigenvalue can be phrased as an SDP. More or less everything I know about SDP comes from here. The problem itself is on page $5$.

Given a PSD matrix $A$ with eigenvalues $\lambda_1 \geq \lambda_2 \ldots \geq \lambda_n$, it's clear that the eigenvalues of $tI - A$ are $t - \lambda_i$ (where $I$ represents the identity matrix). Then $tI - A$ has all positive eigenvalues only when $t \geq \lambda_1$. Thus $$\lambda_1 = \text{min} \{ t \; | \; tI -A \succeq 0\}$$

Where I'm using $\succeq 0$ to mean the PSD ordering. I know SDP problems can take the form

\begin{align*} \text{minimize} \quad & C \bullet X \\ \text{s.t} \quad & A_i \bullet X \geq b_i \quad i = 1, \ldots, m \\ & X \succeq 0 \\ \end{align*}

How can I phrase the first minimum as a minimum of this form? What are the $C,X,A_i, b_i$? Even further, if $A(x)$ is a PSD matrix given by a vector $x$, how could I phrase minimizing the largest eigenvalue of $A(x)$ as an SDP in this form? Again I'm not sure how to choose the $C,X$, etc. This comes from trying to understand this.

2

There are 2 best solutions below

2
On BEST ANSWER

The easiest strategy here is to realize that SDP solvers solve SDPs defined by data $(b,C,A)$ defining the primal and dual pair, and they solve both simultaneously.

The primal (which you are trying to fit your model to)

\begin{align*} \text{minimize} \quad & C \bullet X \\ \text{s.t} \quad & A_i \bullet X = b_i \quad i = 1, \ldots, m \\ & X \succeq 0 \\ \end{align*}

and the dual

\begin{align*} \text{maximize} \quad & b^Ty \\ \text{s.t} \quad & C - \sum_{i=1}^m A_i y_i \succeq 0 \\ \end{align*}

The problem you are trying to solve is easily interpreted from the dual side

\begin{align*} \text{maximize} \quad & -t \\ \text{s.t} \quad & -A - (-I)t \succeq 0 \\ \end{align*}

I.e you already have data to send to the solver $(-1,-A,-I)$. The dual solution returned by the solver is your $t$.

7
On

The trick is to formulate the constraints such that $X$ becomes: $$X = \begin{pmatrix}t^{+} & 0 & O \\ 0 & t^{-} & O \\ O & O & tI-A \end{pmatrix},$$ with the interpretation $t = t^+ - t^-$. You can then take $$C = \begin{pmatrix}1 & 0 & O \\ 0 & -1 & O \\ O & O & O \end{pmatrix}.$$ You need $2$ inequality constraints per element of $A$ to set the $tI-A$ block in $X$. For example, to set $X_{33}$ equal to $(tI-A)_{11}$ you need the inequalities $X_{33} - t \geq -A_{11}$ and $-(X_{33} - t) \geq A_{11}$:

$$ \begin{pmatrix}-1 & 0 & 0 & O \\ 0 & 1 & 0 & O \\ 0 & 0 & 1 & O \\ O & O & O & O \end{pmatrix} \bullet X \geq -A_{11} $$

$$ \begin{pmatrix}1 & 0 & 0 & O \\ 0 & -1 & 0 & O \\ 0 & 0 & -1 & O \\ O & O & O & O \end{pmatrix} \bullet X \geq A_{11}. $$

Naively you could generate $2n^2$ constraints, but due to symmetry you only need $2 \cdot 0.5 n(n+1)$.

As pointed out in the comments, this formulation is very inefficient because it has variables. The off-diagonal elements of $X$ are constrained to fixed values, but solvers do not take advantage of this and just treat those elements as variables.