Computation of Riemannian Hessian of the Stiefel manifold

I am fairly new to differential geometry, so I apologize if my notation is sloppy.

In the book Optimization Algorithms on Matrix Manifolds, the Riemannian Hessian is given by definition 5.1 as:

$$\operatorname{Hess} f(x)[\xi_x] = \nabla_{\xi_x} \operatorname{grad} f$$

where $f$ is a real-valued function on $M$, $\xi_x \in T_xM$ (the tangent space of $M$ at $x$), and $\nabla$ is the Riemannian connection on $M$.

Using the definition of the Riemannian connection for the Stiefel manifold (eq. 5.17):

$$\nabla_{\eta_x} \xi = P_x\left(D\xi(x)[\eta_x]\right)$$

with $P_x$ the projection onto $T_xM$, we can then write the Riemannian Hessian as:

$$\operatorname{Hess} f(x)[\xi_x] = P_x\left(D \operatorname{grad} f(x)[\xi_x]\right)$$

Now my questions are:

  • What is the operator $D$ in this equation?
  • How would I "evaluate" at $\xi_x$? Do I just multiply this tangent vector by the resulting Hessian matrix?

My guess is that to somehow match the Euclidean Hessian, $D$ should be the Jacobian. What confuses me is that this operator is used both for scalar-valued, as well as for vector-valued functions.

On BEST ANSWER

$\newcommand{\rgrad}{\mathrm{rgrad}}$ $\newcommand{\grad}{\mathrm{grad}}$ $\newcommand{\hess}{\mathrm{hess}}$ $\newcommand{\rD}{\mathrm{D}}$ $\newcommand{\rP}{\mathrm{P}}$ Try this paper :-) https://link.springer.com/article/10.1007/s10957-023-02242-z (ReadCube link: https://rdcu.be/ddyWJ), which also has the Stiefel manifold formula worked out.

In matrix/vector calculus language, $\rD$ is just the directional derivative. Thus, $\rD F(x)[\xi_x]$ is the directional derivative $$\rD_{\xi_x} F(x) = \lim_{h\to 0}\frac{1}{h}(F(x+h\xi_x) - F(x)).$$ Here, we extend a function on the submanifold $M$ (the Stiefel manifold in this case) to a function $F$ on the ambient space near $M$, so that $F(x+h\xi_x)$ makes sense even though $x+h\xi_x$ generally leaves $M$. With that extension, the limit is well-defined for a tangent vector $\xi_x$ at $x$. It is linear in $\xi_x$, so it is the same as applying the Jacobian of $F$ at $x$ to $\xi_x$, and the same formula works whether $F$ is scalar-valued or vector-valued. Strictly speaking, $\rD$ is probably more properly defined as the covariant derivative in the ambient space, which is defined for vector fields, but in any case it is evaluated as the directional derivative above. The main idea is that $\rD_{\xi_x} F(x)$ can be evaluated by ordinary calculus.
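As a small worked instance of this limit, take the illustrative choice $F(x) = xx^TAx$ for a fixed square matrix $A$ (this $F$ is picked here only because it is the nonlinear term that shows up in the example further down). Expanding in $h$ and keeping first-order terms gives the usual product rule:

$$\begin{aligned} F(x+h\xi_x) &= (x+h\xi_x)(x+h\xi_x)^TA(x+h\xi_x) \\ &= xx^TAx + h\left(\xi_xx^TAx + x\xi_x^TAx + xx^TA\xi_x\right) + O(h^2), \end{aligned}$$

so that

$$\lim_{h\to 0}\frac{1}{h}\left(F(x+h\xi_x) - F(x)\right) = \xi_xx^TAx + x\xi_x^TAx + xx^TA\xi_x.$$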

For example, take $F$ to be the Riemannian gradient $\rgrad_f$ of $f(x) = \operatorname{tr}(x^TAx)$, the Rayleigh quotient for a symmetric matrix $A$ (when $x$ is a single column this is just $x^TAx$). Then $\rgrad_f(x) = 2\rP_xAx$. Here, the ambient gradient is $2Ax$ by calculus, and $\rP_x$ is the metric-compatible projection onto the tangent space of the Stiefel manifold under the embedded metric, as is the case in the book: $$\rP_x\omega = \omega - x(x^T\omega)_{sym}$$ where $X_{sym} = \frac{1}{2}(X+X^T)$. Since $A$ is symmetric, $x^TAx$ is already symmetric, so: $$\begin{aligned}\rgrad_f(x) &= 2Ax - 2x(x^TAx)_{sym} = 2Ax - 2xx^TAx,\\ \rD_{\xi_x}\rgrad_f(x) &= 2A\xi_x - 2\xi_xx^TAx - 2x\xi_x^TAx - 2xx^TA\xi_x\end{aligned}$$ and the Riemannian Hessian is $\rP_x\{\rD_{\xi_x}\rgrad_f(x)\}$.
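To make the recipe concrete, here is a minimal NumPy sketch that evaluates the formulas above and compares the closed-form directional derivative with a central finite difference. It assumes the cost is $f(x) = \operatorname{tr}(x^TAx)$ with $A$ symmetric; the helper names (`sym`, `proj`, `rgrad`, `rhess`), the sizes and the random test data are all illustrative choices, not something taken from the book or the paper.

```python
import numpy as np

def sym(M):
    """Symmetric part of a square matrix."""
    return 0.5 * (M + M.T)

def proj(x, w):
    """P_x w = w - x sym(x^T w): projection onto the tangent space at x."""
    return w - x @ sym(x.T @ w)

def rgrad(x, A):
    """2Ax - 2 x x^T A x. On the manifold this is the Riemannian gradient;
    the same expression serves as the extension F to nearby ambient points."""
    return 2 * A @ x - 2 * x @ (x.T @ A @ x)

def rhess(x, A, xi):
    """Closed-form directional derivative of rgrad along xi, then projected."""
    d = (2 * A @ xi
         - 2 * xi @ (x.T @ A @ x)
         - 2 * x @ (xi.T @ A @ x)
         - 2 * x @ (x.T @ A @ xi))
    return proj(x, d)

# Illustrative random data (assumed sizes n = 6, p = 2).
rng = np.random.default_rng(0)
n, p = 6, 2
B = rng.standard_normal((n, n))
A = B + B.T                                        # symmetric A
x, _ = np.linalg.qr(rng.standard_normal((n, p)))   # x^T x = I, a point on the Stiefel manifold
xi = proj(x, rng.standard_normal((n, p)))          # a tangent vector at x

# The directional derivative as a limit, approximated by a central difference.
h = 1e-6
fd = (rgrad(x + h * xi, A) - rgrad(x - h * xi, A)) / (2 * h)
hess_fd = proj(x, fd)

hess = rhess(x, A, xi)
print("max abs difference:", np.max(np.abs(hess - hess_fd)))
```

The point of the comparison is that "evaluating at $\xi_x$" never requires forming a Hessian matrix explicitly: you differentiate the gradient expression along $\xi_x$ and project the result.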