Is there a closed form to this quadratic program?


Problem :

I am currently trying to solve an optimization problem to get some information about a matrix. In the following, for a matrix $A\in\mathbb R^{m\times n}$, $0\leq A$ means that $0\leq A_{ij}$ for all $i$ and $j$. The objective is quadratic and its dual is \begin{align*} \min_{\substack{U\in\mathbb R^{m\times m}:\\0\leq U}} \operatorname{Tr}[(I+U) M (I+U^T)] \end{align*} with $M$ a symmetric PSD matrix in $\mathbb R^{m\times m}$.

After some simplifications, the KKT conditions are equivalent to (the Lagrange multipliers are $M(I+U^T)$) \begin{align*} \begin{cases} 0\leq U\\ 0\leq M (I+U^T)\\ \operatorname{Tr}[U M (I+U^T)]=0 \end{cases} \end{align*}

Since we are minimizing a convex function over a convex domain, the KKT conditions are sufficient; therefore any $U$ satisfying them is optimal for the dual problem.

Note that if we know the index set $\mathcal I=\{ (i,j) : U_{ij}>0 \}$, then the three KKT conditions give $[M(I+U^T)]_{ji}=0$ for all $(i,j)\in\mathcal I$. This is a system of $|\mathcal I|$ linear equations in $|\mathcal I|$ unknowns and can therefore be solved efficiently.

I implemented this using a quadratic programming solver, and after some inspection it seems the following is true:

For any $i$, $j$, if we have $M_{ij} > 0$ then we have $U_{ij} = 0$.

I cannot prove or disprove it; any help would be appreciated.
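For reference, this is roughly the kind of numerical experiment that suggested the conjecture: a minimal sketch using SciPy's L-BFGS-B with nonnegativity bounds (the random PSD matrix and the size $m=5$ here are made up for illustration; the gradient of the objective is $2(I+U)M$ since $M$ is symmetric):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m = 5
B = rng.standard_normal((m, m))
M = B @ B.T                      # random symmetric PSD matrix

def f(x):
    """Objective Tr[(I+U) M (I+U^T)], with U flattened into x."""
    A = np.eye(m) + x.reshape(m, m)
    return np.trace(A @ M @ A.T)

def grad(x):
    """Gradient 2 (I+U) M (valid since M is symmetric), flattened."""
    return (2 * (np.eye(m) + x.reshape(m, m)) @ M).ravel()

res = minimize(f, np.zeros(m * m), jac=grad, method="L-BFGS-B",
               bounds=[(0, None)] * (m * m))
U = res.x.reshape(m, m)
# inspect where U is (numerically) positive vs. the sign pattern of M:
support = U > 1e-6
```

One can then compare `support` entrywise against `M > 0` to test the conjecture on many random instances.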


Discussions :

This problem can be reduced to a much simpler one with vectors. Let $u_i$ be the $i$'th row of $U$ (as a row vector); then \begin{align*} \operatorname{Tr}[(I+U)M(I+U^T)] &= \sum_i (u_i+e_i^T) M (u_i^T + e_i) \end{align*} Therefore we can find $U$ by solving the $m$ independent problems \begin{align*} u_i^T \in \operatorname{argmin}_{0\leq u} (u+e_i)^T M (u+e_i) \end{align*} for each $i$.
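As a sanity check on this decomposition, here is a sketch that solves the $m$ row problems with `scipy.optimize.nnls`; it assumes $M$ is positive definite so that a Cholesky factor $M=LL^T$ exists, which turns each row problem into the nonnegative least-squares problem $\min_{0\leq u}\|L^T u + L^T e_i\|^2$ (the shifted random matrix here is only for illustration):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
m = 5
B = rng.standard_normal((m, m))
M = B @ B.T + m * np.eye(m)    # shifted to be positive definite

L = np.linalg.cholesky(M)      # M = L L^T
U = np.zeros((m, m))
for i in range(m):
    # (u + e_i)^T M (u + e_i) = ||L^T u - (-L^T e_i)||^2, so each row
    # problem is a nonnegative least-squares problem in u:
    U[i], _ = nnls(L.T, -L.T[:, i])

# the per-row objectives should sum to the original trace objective:
A = np.eye(m) + U
total = np.trace(A @ M @ A.T)
rowsum = sum((U[i] + np.eye(m)[i]) @ M @ (U[i] + np.eye(m)[i])
             for i in range(m))
```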

The KKT conditions of each such problem are obtained by splitting the KKT conditions above accordingly: \begin{align*} \begin{cases} 0\leq u\\ 0\leq M (u+e_i)\\ u^T M (u+e_i)=0 \end{cases} \end{align*}

It is not too hard to prove that for the $i$'th problem, the solution $u_i$ has a $0$ entry at index $i$. Indeed, the unconstrained optimum is $-e_i$, and if we have a candidate $u$ with $u_i >0$, then $\frac{1}{1+u_i} u + \frac{u_i}{1+u_i} (-e_i)$ is a convex combination of our candidate with the unconstrained optimum. This combination is feasible (its $i$'th entry is zero and the remaining entries are nonnegative), and by convexity its objective value is at most $\frac{1}{1+u_i}$ times that of $u$, since the unconstrained optimum attains the value $0$; this is strictly smaller whenever the candidate's value is positive (e.g. when $M$ is positive definite). This means that we can always take $u_i=0$.
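The convex-combination argument can be checked numerically. In this sketch the candidate $u$ and the matrix are arbitrary made-up data; $v$ is the convex combination from the argument, and we verify that it is feasible with a strictly smaller objective:

```python
import numpy as np

rng = np.random.default_rng(3)
m = 5
B = rng.standard_normal((m, m))
M = B @ B.T + np.eye(m)        # positive definite, so f(u) > 0 for u >= 0
i = 0
e_i = np.eye(m)[i]

def f(u):
    return (u + e_i) @ M @ (u + e_i)

u = rng.random(m) + 0.1        # arbitrary candidate with u[i] > 0
# convex combination of u with the unconstrained optimum -e_i:
v = u / (1 + u[i]) + (u[i] / (1 + u[i])) * (-e_i)
# v is feasible (v[i] == 0, other entries >= 0) and improves the
# objective: by convexity f(v) <= f(u)/(1+u[i]) < f(u), since f(-e_i) = 0.
```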

Note that if we know which entries of $u$ are zero, then we can obtain the values of the rest by solving a simple linear system. Indeed, if $\mathcal I=\{ j : u_j>0 \}=\{ j_1, j_2,\dots, j_k \}$, let \begin{align*} u_{\mathcal I} &= \begin{bmatrix} u_{j_1}\\ u_{j_2}\\ \vdots\\ u_{j_k} \end{bmatrix}\\ M_{\mathcal I\mathcal I} &= \begin{bmatrix} M_{j_1j_1} & M_{j_1j_2}&\dots&M_{j_1j_k}\\ M_{j_2j_1} & M_{j_2j_2}&\dots&M_{j_2j_k}\\ \vdots&\vdots&\ddots&\vdots\\ M_{j_kj_1} & M_{j_kj_2}&\dots&M_{j_kj_k}\\ \end{bmatrix}\\ m_{i\mathcal I} &= \begin{bmatrix} M_{ij_1}\\ M_{ij_2}\\ \vdots\\ M_{ij_k} \end{bmatrix} \end{align*} Then from the KKT conditions, $u_{\mathcal I}$ satisfies $M_{\mathcal I\mathcal I} u_{\mathcal I}=-m_{i\mathcal I}$. This means that to solve the original problem, we could search over the possible sets of zero indices.
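Given a guessed support, that linear system is a one-liner. In this sketch the support $\mathcal I$ and the matrix are made-up illustrations; note that for a wrong guess the solved $u_{\mathcal I}$ need not be nonnegative, so a search over supports would still have to check feasibility:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 5
B = rng.standard_normal((m, m))
M = B @ B.T + m * np.eye(m)    # positive definite, so M_II is invertible

i = 0
I = [2, 4]                     # hypothetical guessed support {j : u_j > 0}
u = np.zeros(m)
# solve M_II u_I = -m_iI for the entries on the support:
u[I] = np.linalg.solve(M[np.ix_(I, I)], -M[I, i])
# u then satisfies the stationarity equations [M(u + e_i)]_j = 0, j in I:
residual = M @ (u + np.eye(m)[i])
```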

From this, I think we can rephrase the above problem as follows:

Suppose we have \begin{align*} M=\begin{bmatrix} m_{11}&m_1^T\\ m_1&M_{11} \end{bmatrix} \end{align*} a positive definite matrix of dimension $m\times m$, and a vector $u\in\mathbb R^{m-1}$ with $0<u$ such that \begin{align*} M_{11} u = -m_1 \end{align*} Then $m_1<0$.