Minimize $\log|\Sigma|+\text{Tr}[\Sigma^{-1}S]$ with $\Sigma=AA'+\Phi$

55 Views Asked by At

Let $S,\Phi$ be $n\times n$ real positive definite matrices, with $\Phi$ diagonal. Consider the problem of minimizing

$$L(A)=\log|\Sigma|+\text{Tr}[\Sigma^{-1}S], \quad \Sigma=AA'+\Phi,$$

subject to $A\in \mathbb R^{n\times k}$, where $k<n$. I want to show the following proposition :

Proposition. Let $UDU'$ be an eigendecomposition of $\Phi^{-1/2}S\Phi^{-1/2}$, with diagonal elements of $D$ in decreasing order. Then the map $A\mapsto L(A)$ is minimized at

$$A^*=\Phi^{1/2}U_k\max\{D_k-I_k,0\}^{1/2},$$

where $U_k$ consists of the first $k$ columns of $U$, and $D_k$ is the $k\times k$ upper-left block of $D$.

Proof. The first step is to transform the optimization problem into an equivalent one. Using the properties of determinant and trace we have

$$\min_{A\in \mathbb R^{n\times k}} \log|\Sigma|+\text{Tr}[\Sigma^{-1}S], $$

$$\iff \min_{A\in \mathbb R^{n\times k}} \log|\Phi^{1/2}\Phi^{-1/2}\Sigma\Phi^{-1/2}\Phi^{1/2}|+\text{Tr}[\Phi^{1/2}\Sigma^{-1}\Phi^{1/2}\Phi^{-1/2}S\Phi^{-1/2}],$$

$$\iff \min_{A\in \mathbb R^{n\times k}} \log|\Phi|+\log|\Sigma^*|+\text{Tr}[{\Sigma^*}^{-1}S^*], $$

$$\iff \min_{B\in \mathbb R^{n\times k}} \log|\Sigma^*|+\text{Tr}[{\Sigma^*}^{-1}S^*], $$

where $\Sigma^*=\Phi^{-1/2}\Sigma\Phi^{-1/2}=BB'+I_n$, $B=\Phi^{-1/2}A$, and $S^*=\Phi^{-1/2}S\Phi^{-1/2}$.

Now $BB'$ is positive semi-definite with rank at most $k$. So its eigendecomposition can be written $(A^*)(A^*)'=QVQ'$ where $Q$ is $n\times k$ with $Q'Q=I_k$ and $V$ is $k\times k$ diagonal with non-negative diagonal entries in decreasing order. Conversely if $Q,V$ satisfy these properties then $QVQ'=BB'$ with $B=QV^{1/2}$. Therefore the preceding problem is equivalent to

$$\iff \min_{Q\in \mathbb R^{n\times k}\\ V\in \mathbb R^{k\times k} \\ Q'Q=I_k \\ V\geq 0 \text{ diagonal } } \log|\Sigma^*|+\text{Tr}[{\Sigma^*}^{-1}S^*], $$

with $\Sigma^*=QVQ'+I_n$ and diagonal elements of $V$ in decreasing order.

Now if $Q \in \mathbb R^{n\times k}$ satisfies $Q'Q=I_k$ then we can write $Q=UM'$ with $M=Q'U\in \mathbb R^{k\times n}$ satisfying $MM'=I_k$. Conversely if $M \in \mathbb R^{k\times n}$ satisfies $MM'=I_k$ then $Q=UM'$ satisfies $Q'Q=I_k$. Therefore the preceding problem is equivalent to

$$\iff \min_{M\in \mathbb R^{k\times n}\\ V\in \mathbb R^{k\times k} \\ MM'=I_k \\ V\geq 0 \text{ diagonal } } \log|\Sigma^*|+\text{Tr}[{\Sigma^*}^{-1}S^*] $$

with $\Sigma^*=UM'VMU'+I_n=U[M'VM+I_n]U'$ and diagonal elements of $V$ in decreasing order.

We are now going to solve this last problem. First we have, using Sylvester's determinant theorem,

$$|\Sigma^*|=|U[M'VM+I_n]U'|=|M'VM+I_n|=|V+I_k|=\prod_{j=1}^k (1+v_j)$$

Second, by the Woodbury matrix identity we have $$ {\Sigma^*}^{-1}=U[I_n-M'V(I_k+V)^{-1}M]U'$$ and so

$$\text{Tr}[{\Sigma^*}^{-1}S^*]=\text{Tr}[[I_n-M'V(I_k+V)^{-1}M]D]=\text{Tr}[D]-\text{Tr}[M'V(I_k+V)^{-1}MD]$$

$$=\sum_{i=1}^n d_i -\sum_{j=1}^k \sum_{i=1}^n m_{ji}^2 \frac{v_jd_i}{1+v_j}$$

Therefore the value of the objective function is

$$\sum_{i=1}^n d_i+\sum_{j=1}^k \log(1+v_j)-\sum_{j=1}^k \sum_{i=1}^n m_{ji}^2 \frac{v_jd_i}{1+v_j}$$

Since the $d_i$'s and $v_j$'s are non-negative and in decreasing order, for a fixed $V$ the above function is minimized by choosing $m_{ji}=1$ if $j=i$ and $0$ otherwise, and the resulting objective function is

$$\sum_{i=1}^n d_i+\sum_{j=1}^k \log(1+v_j)-\sum_{j=1}^k \frac{v_jd_i}{1+v_j}$$

Using calculus one can check that $v_j\mapsto\log(1+v_j)-\frac{v_jd_i}{1+v_j}$ subject to $v_j\geq 0$ is minimized at $\max\{d_i-1,0\}$, $j=1\dots,k$.

In summary the minimum of the last problem is attained at $M=[I_k : 0_{k\times n-k}]$ and $V=\max\{D_k-I_k,0\}$. By reversing the steps we get that $A \mapsto L(A)$ attains a minimum at $A^*=\Phi^{1/2}U_k\max\{D_k-I_k,0\}^{1/2}$.

Is this reasoning correct? Thanks a lot for your help.