On a constrained trace minimisation problem

408 Views Asked by At

Suppose $S = \pmatrix{1&1\\ 1&0\\ 0&1}$, $W$ is a $3\times3$ covariance matrix, which could be regarded as fixed.

I need to find a $2\times 3$ matrix $Q$ that minimizes $$ \operatorname{trace}\left[(SQ)W(SQ)^T\right] $$ subject to $SQS=S$.

Please help me to solve this issue!!!...

Thanks

2

There are 2 best solutions below

11
On

I have devided my answer in two parts. First I introduce the formulation. Second, I consider the optimization.

1) Formulation: For simplicity let me rewrite your problem in vector form and formulate it as a quadratic programming problem of the form $$ \min_q q^T R q ~~~~~~{\rm s.t.}~~~~~~ Aq = s. $$

The following definitions are necessary to forumulate the constraints. For convencience, we consider $S^TQ^TS^T=S^T$ instead of $SQS=S$.

  • $q={\rm vec}(Q^T)$, i.e. if $Q^T$ consists of the column vectors of $Q^T = [q_1,q_2]$, then $q$ is the vector which contains both.
  • $s={\rm vec}(S^T)$.

  • $A = S \otimes S^T$ where $\otimes$ is the Kronecker product. According to http://en.wikipedia.org/wiki/Kronecker_product we can then rewrite $S^TQ^TS^T=S^T$ equivalently as $Aq=s$.

Let us now rewrite the target function:

  • Let us define the matrix $P=QS$ with the three row vectors $p_1$, $p_2$ and $p_3$ and we have $trace[SQW(SQ)^T]=\sum_{i=1}^3 p_i W p_i^T$.
  • Moreover, we define $p = [p_1,p_2,p_2] = {\rm vec}^T\{(SQ)^T\}$ and $V = \rm{blockdiag}[W,W,W]$. ($V$ is the blockdiagonal matrix which has three $W$s on its diagonal) to have $trace[SQW(SQ)^T]=\sum_{i=1}^3 p_i W p_i^T = p V p^T$.
  • Exploiting the propoerties of the Kronecker product, we have $p^T = {\rm vec}\{(SQ)^T\} = (S \otimes \mathbf{I})q $ where $I$ is the 3x3 identiy matrix.
  • Finally we can rewrite $trace[SQW(SQ)^T]=q^T R q$ where $R = (S \otimes \mathbf{I})^T V (S \otimes \mathbf{I})$

2) Optimization: Having the optimization problem in vector, it is straight forward to solve it, see http://en.wikipedia.org/wiki/Quadratic_programming. Note that, even though $A$ is 6x6 and $s$ also has length 6 there are still degrees of freedom left for the optimization as $A$ does not have full rank (I checked this with Matlab).

One possible algorithm to solve the optimization problem is explained here: Let $q_0 $ be a particular solution to the equation system such that $Aq_0 = s$ and let $q_1$ element of the null space of $A$ such that $A q_1 = 0$. Let $B $ contain a basis for the null space (B can be found by using SVD for example) such that $q_1$ can be expressed as $q_1 = B x$, where $x$ is an arbitrary vector such that $ABx = 0$. Then the quadratic programming problem can be formulated in respect to the free variable $x$ as

$$ \min_{q_1} (q_1 + q_0)^T R( q_1 +q_0) = \min_{x} (Bx + q_0)^T R( Bx+q_0)= \min_x x^T B^T R B x + 2 x^T B^T R q_0 + const. $$ Note that $q_0$ is fix here!

The solution $x^\star$ to this problem is given by $x^\star= (B^T R B)^{-1} B^T R q_0$.

With $x^\star$ you find $q_1$, together with $q_0$ you find the matrix $Q$.

12
On

Presumably, all matrices here are real. If $S\in M_{m+n,\,n}(\mathbb R)$ has full column rank, the equation $SQS=S$ actually has a unique solution: since $Q$ has the same size as $S^T$ and the latter has full row rank, $Q=XS^T$ for some square matrix $X\in M_n(\mathbb R)$. Therefore $SQS=S$ then implies that $S^TSXS^TS=S^TS$, i.e. $X=(S^TS)^{-1}$. Consequently, $Q=(S^TS)^{-1}S^T$.