A bilinear matrix inequality


Let $Q\in\mathbb{S}_{++}^{n\times n}$ be a given positive definite matrix and $b\in \mathbb{R}^n$ be a given vector. Consider the following optimization problem $$ \begin{array}{cl} \min_{x,P,y_1,\ldots,y_k} &x^\top Q x + b^\top x\\ &P\succeq \mathbb{I}\\ &a_i^\top P y_i \le -1, \quad \forall i=1,\ldots,k\\ &y_i=K_ix, \quad \forall i=1,\ldots,k\\ \end{array} $$ where $P\in\mathbb{R}^{m\times m}$, $\mathbb{I}$ is the identity matrix, $K_i\in \mathbb{R}^{m\times n}$ are given matrices and $a_i\in \mathbb{R}^{m}$ are given vectors.

Here I have two questions: 1. Is there any method to solve this optimization problem, especially when $k$ is large? 2. Is there a suitable change of variables such that the resulting problem becomes convex, or at least nicer in some sense?


To make the problem more concise, I substitute $y_i = K_ix$ and obtain the following.

$$ \begin{array}{cl} \min_{x,P} &x^\top Q x + b^\top x\\ &P\succeq \mathbb{I}\\ &a_i^\top PK_ix \le -1, \quad \forall i=1,\ldots,k \end{array} $$

There may be a more elegant approach, but I think the following is one way to make the problem more tractable.

Reformulation with Augmented Lagrangian

Substitute $\theta_i = PK_ix$; constructing the augmented Lagrangian (the ADMM way [1]), we have $$ \begin{array}{cl} \min_{x,P,\theta_1,\ldots,\theta_k}\max_{\mu_1,\ldots,\mu_k} &x^\top Q x + b^\top x + \sum_{i=1}^k\mu_i^\top(\theta_i - PK_ix) + \frac{\rho}{2}\sum_{i=1}^k\|\theta_i - PK_ix\|_2^2\\ &P\succeq \mathbb{I}\\ &a_i^\top \theta_i \le -1, \quad \forall i=1,\ldots,k\,, \end{array} $$ where $\mu_i,\theta_i \in \mathbb{R}^m$ for all $i \in [k]$ and $\rho>0$. Note that $\rho$ is chosen before optimization starts; a larger $\rho$ penalizes violations of the coupling constraint $\theta_i = PK_ix$ more heavily and should work well here.
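For concreteness, here is one way the per-iteration updates could be spelled out (a sketch, assuming $P$ is held fixed during the $\theta$- and $x$-steps; $\Pi_C$ denotes Euclidean projection onto the set $C$):

$$
\begin{aligned}
\theta_i^{+} &= \Pi_{\{\theta\,:\,a_i^\top\theta\le-1\}}\!\Big(PK_ix-\tfrac{1}{\rho}\mu_i\Big),\\
x^{+} &= \Big(2Q+\rho\sum_{i=1}^k (PK_i)^\top PK_i\Big)^{-1}\Big(\sum_{i=1}^k (PK_i)^\top\big(\mu_i+\rho\,\theta_i^{+}\big)-b\Big),\\
\mu_i^{+} &= \mu_i+\rho\,\big(\theta_i^{+}-PK_ix^{+}\big).
\end{aligned}
$$

The $x$-update is well defined since $Q\succ 0$ makes the system matrix invertible. The $P$-update has no similarly clean closed form because the quadratic in $P$ is coupled with the constraint $P\succeq\mathbb{I}$; one inexact but simple option is a gradient step on the augmented Lagrangian followed by projection onto $\{P\succeq\mathbb{I}\}$ (see the sketch below).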

Now, using alternating updates (ADMM), it is straightforward to handle the constraints, as all the involved constraint sets are closed and convex. You may also refer to [2].
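As a minimal sketch of the two projections involved (function names are mine, assuming NumPy):

```python
import numpy as np

def project_halfspace(v, a, c=-1.0):
    """Euclidean projection of v onto the halfspace {theta : a^T theta <= c}."""
    viol = a @ v - c
    if viol <= 0.0:
        return v                      # already feasible
    return v - (viol / (a @ a)) * a   # move along a to the boundary

def project_succeq_identity(P):
    """Euclidean projection onto {P : P symmetric, P >= I}:
    symmetrize, then raise every eigenvalue below 1 up to 1."""
    S = 0.5 * (P + P.T)
    w, V = np.linalg.eigh(S)
    return (V * np.maximum(w, 1.0)) @ V.T
```

The halfspace projection costs $O(m)$ per constraint, while the projection onto $\{P\succeq\mathbb{I}\}$ costs one $m\times m$ eigendecomposition per iteration.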

Tip for large $k$:

The $\theta_i$ variables do not depend on each other, hence the $\theta_i$ updates can be done in parallel; the same applies to the $\mu_i$ updates. A vectorized sketch is given below.
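For illustration, here is one way to batch these updates with NumPy (array names and sizes are mine; rows of `A` and `T` hold $a_i$ and the current $PK_ix$, respectively):

```python
import numpy as np

rng = np.random.default_rng(0)
k, m, rho = 100, 5, 10.0                  # illustrative sizes and penalty
A = rng.standard_normal((k, m))           # row i: a_i
T = rng.standard_normal((k, m))           # row i: current P @ K_i @ x
Mu = np.zeros((k, m))                     # row i: mu_i

# theta-update for all i at once: project P K_i x - mu_i / rho
# onto {theta : a_i^T theta <= -1}, row-wise.
V = T - Mu / rho
viol = np.einsum('ij,ij->i', A, V) + 1.0  # a_i^T v_i - (-1); > 0 if infeasible
step = np.maximum(viol, 0.0) / np.einsum('ij,ij->i', A, A)
Theta = V - step[:, None] * A

# (x- and P-updates would go here; recompute T with the new x and P.)
Mu = Mu + rho * (Theta - T)               # dual ascent for all i in parallel
```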

References:

[1] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers": https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf

[2] S. Boyd, ADMM talk slides: https://web.stanford.edu/~boyd/papers/pdf/admm_slides.pdf