Matrix optimization

445 Views Asked by At

I'm trying to minimize over $U$ the objective $\|X^{\top}UU^{\top}UU^{\top}X\|_F^2 = \text{trace}(X^{\top}UU^{\top}UU^{\top}XX^{\top}UU^{\top}UU^{\top}X)$ subject to $U^{\top}U = I$, where $X \in \mathbb{R}^{D \times N}$ and $U \in \mathbb{R}^{D \times d}$ for a given $d$. The constraint simplifies the objective slightly to $\text{trace}(X^{\top}UU^{\top}XX^{\top}UU^{\top}X)$. I'm guessing that the method of Lagrange multipliers is the way to go, but I'm not even sure how to take the derivative of the objective.Could someone point me in the right direction? I'm thinking that perhaps the way to go is taking the derivative with respect to $UU^{\top}$ and working from there.

1

There are 1 best solutions below

0
On

The problem may be simplified by working with $Y = XX^{\top}$, since $$ \text{trace}(X^{\top}UU^{\top}XX^{\top}UU^{\top}X) = \text{trace}( U^{\top} YU U^{\top} YU). $$ Then write $Y = VSV^{\top}$, where $S$ is $D \times D$ diagonal with non-negative non-increasing diagonal entries $s_j$ and $V$ is $D \times D$ orthogonal. Setting $\tilde U = V^{\top} U$, the problem now is to minimize $$ \text{trace}( \tilde U^{\top} S \tilde U \tilde U^{\top} S \tilde U) $$ Now choose $\tilde U$ to be the last $d$ columns of the $D \times D$ identity matrix (so $U$ consists of the last $d$ columns of $V$). Then your functional turns into $$ \text{trace}( U^{\top} YU U^{\top} YU) = \sum_{j = D-d+1} s_j^2 $$ that is the sum of squares of the last $d$ entries of $S$. If $N \le D-d$, then these entries are zero and clearly the result is minimal.

I suspect that this choice of $\tilde U$ also gives the minimal solution in the general case.