Maximize $\log\det(\mathbf{I} + \sum_{i=1}^K \mathbf{H}_i \mathbf{WW}^\top \mathbf{H}_i^\top)$ wrt $\mathbf{W}$ over hypersphere manifold


Recently, I encountered the following optimization problem in my research:

$$\max_{\mathbf{W}\in \mathcal{S}}\log\det \left(\mathbf{I} + \sum_{i=1}^K \mathbf{H}_i \mathbf{WW}^\top \mathbf{H}_i^\top\right)$$

where

  • $\mathbf{I}$ is the $M\times M$ identity matrix,
  • $\mathbf{H}_i\in \mathbb{R}^{M\times N}$ are arbitrary matrices, $i=1,\dots,K$,
  • $\mathcal{S}=\{\mathbf{W}\in \mathbb{R}^{N\times S} \mid \|\mathbf{W}\|_F = 1\}$ with $S<N$, i.e., the hypersphere manifold.
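
As a side remark (my own restatement, not part of the problem as given): the objective depends on $\mathbf{W}$ only through $\mathbf{X}=\mathbf{WW}^\top$, and $\|\mathbf{W}\|_F=1$ is equivalent to $\operatorname{tr}(\mathbf{X})=1$, so the problem can equivalently be written over a spectrahedron with a rank constraint:

$$\max_{\mathbf{X}\succeq 0,\ \operatorname{tr}(\mathbf{X})=1,\ \operatorname{rank}(\mathbf{X})\le S}\ \log\det\left(\mathbf{I} + \sum_{i=1}^K \mathbf{H}_i \mathbf{X} \mathbf{H}_i^\top\right)$$

Since $\log\det(\cdot)$ is concave on positive definite matrices and $\mathbf{X}\mapsto \mathbf{I}+\sum_i \mathbf{H}_i\mathbf{X}\mathbf{H}_i^\top$ is affine, the objective is concave in $\mathbf{X}$; without the rank constraint this would be a convex problem.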

This problem is nonconvex in $\mathbf{W}$, so I tried manifold optimization (Riemannian gradient descent) with Manopt. An interesting observation: the resulting $\mathbf{WW}^\top$ always converges to the same matrix, and hence the objective converges to the same value, regardless of the chosen initial point. I wondered whether this result is the global optimum and tried to figure out why.
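
One thing worth noting about this observation: the objective is invariant under $\mathbf{W}\mapsto\mathbf{WQ}$ for any orthogonal $\mathbf{Q}\in\mathbb{R}^{S\times S}$, since $(\mathbf{WQ})(\mathbf{WQ})^\top=\mathbf{WW}^\top$, so only $\mathbf{WW}^\top$ (not $\mathbf{W}$ itself) can be identified by the optimizer. A small sanity check of this invariance (my own addition, not part of the solver; `pagemtimes` needs R2020b+):

```matlab
% Check that the objective only depends on W through W*W':
% replacing W by W*Q with Q orthogonal must not change the cost.
M = 8; N = 6; S = 2; K = 6;
H = randn(M, N, K);
W = randn(N, S);
W = W / norm(W, 'fro');            % point on the hypersphere ||W||_F = 1
[Q, ~] = qr(randn(S));             % random S-by-S orthogonal matrix

% f(W) = log det(I + sum_i H_i * (W*W') * H_i')
f = @(W) log(det(eye(M) + ...
    sum(pagemtimes(pagemtimes(H, W*W'), 'none', H, 'transpose'), 3)));

abs(f(W) - f(W*Q))                 % zero up to rounding error
```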

Could you please give me some hints or references? Here is a small MATLAB snippet that solves the problem using Manopt:

M = 8;
N = 6;

num_H = 6;

H = zeros(M,N,num_H);
for i = 1:num_H
    H(:,:,i) = randn(M,N);
end

%%
S = 2;
% spectrahedronfactory(N, S): N-by-S matrices with unit Frobenius norm,
% quotiented by right-multiplication with S-by-S orthogonal matrices,
% so the effective variable is x*x'.
manifold = spectrahedronfactory(N, S);
problem.M = manifold;
problem.cost = @(x) -cost(num_H, H, M, x);
problem = manoptAD(problem);   % automatic differentiation for the gradient
% [x, xcost, info] = steepestdescent(problem);     
[x, xcost, info] = conjugategradient(problem);


figure;
semilogy([info.iter], [info.gradnorm], '.-');
xlabel('Iteration #');
ylabel('Gradient norm');
title('Convergence on the sphere');

x * x'
xcost

function obj = cost(num_H, H, M, x)
    % Accumulate sum_i H_i * (x*x') * H_i' (avoid shadowing the builtin `sum`)
    acc = zeros(M);
    for i = 1:num_H
        acc = acc + H(:,:,i) * (x * x') * H(:,:,i)';
    end
    obj = log(det(eye(M) + acc));
end

Many thanks!