Bregman Matrix Divergence induced by trace norm

I am studying the Bregman matrix divergence of symmetric matrices (see https://web.stanford.edu/group/mmds/slides/dhillon-mmds.pdf), which is defined as

$D_\psi (X,Y) = \psi(X)-\psi(Y) - \text{tr}((\nabla\psi(Y))^\top(X-Y))$,

where $\text{tr}(X)$ is the trace of $X$. It seems that commonly used $\psi$ include $\psi(X) = \frac{1}{2}\text{tr}(X^\top X)$, $\psi(X) = \text{tr}(X\log X-X)$, and $\psi(X) = -\log \det (X)$.

I was wondering whether I can use the squared trace norm $\psi(X) = \|X\|_*^2 = (\text{tr}(\sqrt{X^\top X}))^2$, but I have never seen a reference for this choice.

In addition, if I consider symmetric positive definite matrices, then I have $\psi(X) = (\text{tr}(X))^2$, and therefore

$D_\psi (X,Y) = (\text{tr}(X))^2 - (\text{tr}(Y))^2 - 2\text{tr}(Y) \cdot \text{tr}(X-Y) = (\text{tr}(X)-\text{tr}(Y))^2$,

which seems really weird to me, because as far as I know, $D_\psi (X,Y)=0$ iff $X=Y$. But here we only need $\text{tr}(X)=\text{tr}(Y)$ for the divergence to vanish.
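
A quick numerical sanity check of this derivation, as a minimal sketch assuming NumPy. The function names `psi`, `grad_psi`, and `bregman` and the test matrices are hypothetical, chosen only for illustration, and the gradient $\nabla\psi(Y) = 2\,\text{tr}(Y)\, I$ is the one used in the derivation above.

```python
import numpy as np

def psi(X):
    # psi(X) = (tr X)^2, i.e. the squared trace norm restricted to SPD matrices
    return np.trace(X) ** 2

def grad_psi(X):
    # Gradient of (tr X)^2 with respect to X is 2 * tr(X) * I
    return 2.0 * np.trace(X) * np.eye(X.shape[0])

def bregman(X, Y):
    # D_psi(X, Y) = psi(X) - psi(Y) - tr(grad_psi(Y)^T (X - Y))
    return psi(X) - psi(Y) - np.trace(grad_psi(Y).T @ (X - Y))

# Illustrative SPD matrices (not from the question)
X = np.diag([1.0, 3.0])   # trace 4
Y = np.diag([2.0, 2.0])   # trace 4, but Y != X
Z = np.diag([1.0, 1.0])   # trace 2

d_xy = bregman(X, Y)  # equal traces, different matrices
d_xz = bregman(X, Z)  # different traces

print(d_xy, (np.trace(X) - np.trace(Y)) ** 2)
print(d_xz, (np.trace(X) - np.trace(Z)) ** 2)
```

In both cases the divergence matches $(\text{tr}(X)-\text{tr}(Y))^2$, and it vanishes for the first pair even though the two matrices differ.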

Is there anything wrong with my derivation or could anyone tell me some related references?

Thanks a lot!

There is 1 best solution below

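Your derivation is fine. This happens because $\psi(X) = (\text{tr}(X))^2$ is convex but not strictly convex: it depends on $X$ only through $\text{tr}(X)$, so it is constant along any direction of zero trace. The property $D_\psi (X,Y)=0$ iff $X=Y$ requires $\psi$ to be strictly convex.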
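
To see the lack of strict convexity concretely for $\psi(X) = (\text{tr}(X))^2$, here is a small worked example (the matrices are an illustrative choice, not taken from the question). Take $X = \text{diag}(1,3)$ and $Y = \text{diag}(2,2)$, so $X \neq Y$ but $\text{tr}(X) = \text{tr}(Y) = 4$. Then for every $t \in [0,1]$,

$\psi(tX + (1-t)Y) = (t\,\text{tr}(X) + (1-t)\,\text{tr}(Y))^2 = 16 = t\,\psi(X) + (1-t)\,\psi(Y)$,

so the convexity inequality holds with equality along the whole segment between $X$ and $Y$. Strict convexity would require a strict inequality for $t \in (0,1)$, which is why $D_\psi (X,Y)$ can vanish for $X \neq Y$.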