Computing explicitly the Riemannian Distance on $GL_n^+$


$\newcommand{\ep}{\epsilon}$

Let $GL_n^+$ be the Lie group of invertible $n \times n$ matrices with positive determinant. In particular it's a connected open submanifold of the Euclidean space $\mathbb{R}^{n^2}$.

Now consider it with the induced metric from $\mathbb{R}^{n^2}$. (So it's a Riemannian submanifold of $\mathbb{R}^{n^2}$).

For clarity, this means endowing $GL_n^+ $ with the pullback metric of the Euclidean metric $e$ along the inclusion $ i:GL_n^+ \to (\mathbb{R}^{n^2},e) $. Explicitly: $g_z(X,Y) =tr(X^TY)$ , where $X,Y \in T_z(GL_n^+)$.

Questions:

(1) Is there an explicit formula for the Riemannian distance between two matrices $A,B \in GL_n^+$?

Conjecture: The Riemannian distance equals the Euclidean one.

Attempted proof:

Since $GL_n^+$ is open in $\mathbb{R}^{n^2}$, it follows that it's a totally geodesic submanifold. (That is, all its geodesics are geodesics in $(\mathbb{R}^{n^2},e)$, i.e. they are the usual straight lines in Euclidean space.)

$GL_n^+$ is open $\Rightarrow$ for any $A \in GL_n^+$ there is a Euclidean ball centered at $A$ which is contained in $GL_n^+$. Hence, for all matrices close enough to $A$, their distance from $A$ is just the Euclidean one (since the straight line between them lies in our submanifold).

Now consider $A,B \in GL_n^+$. Let $\alpha:[0,1] \to \mathbb{R}^{n^2}$ be the straight-line path between them. Then:

$$ \det(\alpha(t))=\det(A+t(B-A)) $$ is a polynomial in $t$ of degree $\le n$, not identically zero (it is nonzero at $t=0$). Hence it has only finitely many zeroes: there are at most $n$ points $t_i$ where $\alpha(t_i)$ is not invertible.

Hence, we only need to show that we can make arbitrarily small perturbations of the path around each such 'bad' non-invertible matrix. This would imply that the Riemannian distance equals the Euclidean one.

It would be nice if someone could find a neat argument to show this maneuver is indeed possible.
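As a quick numerical illustration of the finitely-many-zeroes observation, the sketch below (NumPy; the matrix pair is just an example) samples $\det(A+t(B-A))$ along the segment and counts its sign changes:

```python
import numpy as np

def det_sign_changes(A, B, samples=10001):
    """Sample det(A + t(B - A)) on [0, 1] and count sign changes."""
    ts = np.linspace(0.0, 1.0, samples)
    dets = np.array([np.linalg.det(A + t * (B - A)) for t in ts])
    signs = np.sign(dets)
    nonzero = signs[signs != 0]          # drop exact zeros before comparing
    return int(np.sum(nonzero[1:] != nonzero[:-1]))

# Illustrative pair: the segment from Id to -Id (n = 2) hits det = 0 at
# t = 1/2, but det(Id + t(-2 Id)) = (1 - 2t)^2 never becomes negative.
changes = det_sign_changes(np.eye(2), -np.eye(2))
```

For the pair $(Id,-Id)$ the determinant touches zero without changing sign; whether such a sign change occurs is exactly the issue in the update below.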

Update: This conjecture is false. The key point (as noted by Jason DeVito and loup blanc) is that the determinant can be negative on a non-negligible part of the straight-line path $\alpha$. By a continuity argument, any path which approximates the straight path too closely must then enter the region of negative determinant.

It turns out that the behaviour depends on the number of sign changes of the determinant.

Example of a case where the distance is Euclidean (a "jump" is possible): take $n=2$ and $A,B=\pm Id$. Start with the straight-line path from $Id$ to $\begin{bmatrix}\ep & 0 \\ 0 & \ep\end{bmatrix}$. Then go via

(1) $t \to \begin{bmatrix}\ep & -t \\ t & \ep\end{bmatrix}$ to $\begin{bmatrix}\ep & -\ep \\ \ep & \ep\end{bmatrix}$ ($t$ goes $0 \to \ep)$ .

(2) $t \to \begin{bmatrix}t & -\ep \\ \ep & t\end{bmatrix}$ to $\begin{bmatrix}-\ep & -\ep \\ \ep & -\ep\end{bmatrix}$ ($t$ goes $\ep \to -\ep)$.

(3) $t \to \begin{bmatrix}-\ep & -t \\ t & -\ep\end{bmatrix}$ to $\begin{bmatrix}-\ep & 0 \\ 0 & -\ep\end{bmatrix}$ ($t$ goes $\ep \to 0)$ .

Now continue with straight line until reaching $-Id$.

How much did this maneuver cost us?

The derivatives of the three straight legs are $\begin{bmatrix}0 & -1 \\ 1 & 0\end{bmatrix}$, $Id$, $\begin{bmatrix}0 & -1 \\ 1 & 0\end{bmatrix}$; each has norm $\sqrt 2$. The parameter intervals have lengths $\ep$, $2\ep$, $\ep$, so the total length is $\sqrt2 \cdot 4\ep$, which is arbitrarily small, as required.

(Also, note that along each leg the determinant was $t^2 + \ep^2 > 0$, so we stayed in $GL_2^+$.)
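One can check the bookkeeping numerically; the sketch below (NumPy, with an illustrative $\ep=10^{-2}$) measures the three legs and verifies the determinant stays positive along them:

```python
import numpy as np

eps = 1e-2  # illustrative epsilon

# The three epsilon-legs from the text, each reparameterized on s in [0, 1].
def leg1(s):                       # t goes 0 -> eps
    t = eps * s
    return np.array([[eps, -t], [t, eps]])

def leg2(s):                       # t goes eps -> -eps
    t = eps - 2 * eps * s
    return np.array([[t, -eps], [eps, t]])

def leg3(s):                       # t goes eps -> 0
    t = eps * (1 - s)
    return np.array([[-eps, -t], [t, -eps]])

def length_and_min_det(leg, steps=2000):
    """Polyline length of the leg and the minimal determinant along it."""
    ss = np.linspace(0.0, 1.0, steps + 1)
    pts = np.array([leg(s) for s in ss])
    diffs = pts[1:] - pts[:-1]
    length = np.sum(np.sqrt(np.sum(diffs**2, axis=(1, 2))))
    min_det = min(np.linalg.det(p) for p in pts)
    return length, min_det

total = 0.0
for leg in (leg1, leg2, leg3):
    L, mdet = length_and_min_det(leg)
    assert mdet > 0                # we never leave GL_2^+
    total += L

expected = 4 * np.sqrt(2) * eps    # the claimed total cost
```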


(2) Can we compute explicitly, for a given $A \in GL_n^+$, its distance from $SO(n)$?

$dist(A,SO(n)) =\underset{X \in SO(n)}{\text{min}} d(A,X)$

($SO(n)$ is the special orthogonal group and the minimum exists since $SO(n)$ is compact and $d$ is continuous)

And who is the minimizer (the closest matrix to $A$ in $SO(n)$)? Is it unique?

Note: Right or left multiplication by an element of $SO(n)$ is an isometry of $GL_n^+$ with the induced metric. Thus, $d$ is left- (right-) $SO(n)$-invariant.

In particular, if $ A = U\Sigma V^T $ is the SVD decomposition of $A$, then $dist(A,SO(n)) = dist(\Sigma,SO(n))$, where $\Sigma$ is a square diagonal matrix whose diagonal elements are the (strictly positive) singular values of $A$.

So the question of computing the distance from $SO(n)$ (and the minimizer) is reduced to matrices of this type (i.e. diagonal with positive entries).
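The reduction can be sanity-checked numerically for $n=2$ by brute-forcing the Frobenius distance over rotation angles (a sketch; the matrix $A$ below is an arbitrary example with positive determinant, and the Frobenius distance is in general only a lower bound for the Riemannian one):

```python
import numpy as np

def frob_dist_to_SO2(A, samples=20000):
    """Brute-force min over SO(2) of the Frobenius distance ||A - R(theta)||."""
    best = np.inf
    for th in np.linspace(0.0, 2 * np.pi, samples, endpoint=False):
        R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
        best = min(best, np.linalg.norm(A - R))
    return best

A = np.array([[2.0, 1.0], [0.5, 1.5]])   # arbitrary example, det = 2.5 > 0
U, s, Vt = np.linalg.svd(A)
Sigma = np.diag(s)

d_A = frob_dist_to_SO2(A)                # distance from A itself
d_Sigma = frob_dist_to_SO2(Sigma)        # distance from its singular-value matrix
```

Both distances agree, since $\|A-R\|_F=\|\Sigma-U^TRV\|_F$ and $U^TRV$ again ranges over $SO(2)$ (the determinants of $U$ and $V$ have equal sign because $\det A>0$).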

There are 4 answers below.

Answer (accepted):

Travis gives a good reference dated 2014; there is another dated 2011: cf. http://arxiv.org/abs/1109.0520

If $A,B\in GL_n^+(\mathbb{R})$, there is no known closed form for $d(A,B)$. Note that one can study $GL_n(\mathbb{C})$ in the same way.

  1. Consider the scalar product on $GL_n$: $\langle M,N\rangle=tr(M^TN)$. It is left and right $O(n)$-invariant. From it we deduce a left-invariant Riemannian metric defined at $Z\in GL_n$ by $g_Z(X,Y)=tr((Z^{-1}X)^TZ^{-1}Y)$.

  2. Consider a geodesic curve $X:[0,1]\rightarrow GL_n^+$; it is a solution of the second order ODE (1): $X'=XU,U'=[U^T,U]$ where $U\in C^1([0,1],M_n)$.

The locally unique solution of $\{(1),X(0)=X_0,X'(0)=X_0V_0\}$ is $X(t)=X_0e^{tV_0^T}e^{t(V_0-V_0^T)}$, and it is defined on all of $\mathbb{R}$.

Note that the two exponentials simplify iff $V_0$ is a normal matrix; in this case $X(t)=X_0e^{tV_0}$.
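This simplification can be checked numerically (a sketch with a home-made matrix exponential so it stays NumPy-only; the matrices $V_0$ are arbitrary examples):

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential via scaling-and-squaring with a Taylor series."""
    s = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 2), 1e-16)))) + 1)
    A = M / 2.0**s
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        E = E + term
    for _ in range(s):
        E = E @ E
    return E

def geodesic_factor(V0, t):
    """X_0^{-1} X(t) = e^{t V0^T} e^{t (V0 - V0^T)} from the formula above."""
    return expm(t * V0.T) @ expm(t * (V0 - V0.T))

t = 0.7
# Antisymmetric (hence normal) V0: the two exponentials collapse to e^{t V0}.
V_normal = np.array([[0.0, -1.3], [1.3, 0.0]])
gap_normal = np.linalg.norm(geodesic_factor(V_normal, t) - expm(t * V_normal))

# A non-normal (nilpotent shear) V0: the simplification fails.
V_shear = np.array([[0.0, 1.0], [0.0, 0.0]])
gap_shear = np.linalg.norm(geodesic_factor(V_shear, t) - expm(t * V_shear))
```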

  3. The length of the sub-geodesic $X([0,t_0])$ is $t_0\|V_0\|$. We define $d(A,B)$ as the infimum of the lengths of the geodesic curves linking $A$ and $B$. In fact, there exists a geodesic that realizes this bound; this geodesic distance is a metric on $GL_n^+$; yet, in general, we do not know any explicit form of this curve. Small consolation: there is an explicit formula for $n=1$: $d(u,v)=|\log(u/v)|$. Note that if $V_0$ is normal, then $d(X_0,X_0e^{V_0})\leq \|V_0\|$; thus, if $N$ is normal and has no eigenvalues in $(-\infty,0]$, then $d(X_0,X_0N)\leq \|\log(N)\|$, where $\log(\cdot)$ is the principal logarithm.

Finally, the previous distance is right-invariant under rotations and left-invariant with respect to the action of $GL_n^+$.

I think that you can deduce the geodesic distance of $A\in GL_n^+$ to $SO_n$.

EDIT 1. Of course, the previous sentence is a joke. Yet, there is an easy instance: let $k>0$. Then $d(X_0,kX_0)=d(I_n,kI_n)=\sqrt{n}|\log(k)|$. Consequently $d(kI_n,SO(n))=\sqrt{n}|\log(k)|$.
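A numerical sketch consistent with this value: the radial path $X(t)=k^tI_n$, $t\in[0,1]$, has constant speed $\sqrt n|\log k|$ in the metric $g_Z$, so its length is $\sqrt n|\log k|$ (this checks the length of one candidate path, not that it is minimizing):

```python
import numpy as np

n, k = 3, 2.5                    # illustrative dimension and scale factor
log_k = np.log(k)

def X(t):
    """Radial path from I_n (t = 0) to k*I_n (t = 1)."""
    return np.exp(t * log_k) * np.eye(n)

# Numerically integrate the length of t -> X(t) in the left-invariant metric
# g_Z(V, V) = tr((Z^{-1} V)^T Z^{-1} V), using midpoint quadrature and
# central-difference velocities.
steps = 2000
h = 1.0 / steps
length = 0.0
for i in range(steps):
    t = (i + 0.5) * h
    V = (X(t + h / 2) - X(t - h / 2)) / h       # finite-difference velocity
    W = np.linalg.inv(X(t)) @ V                 # pull back to the identity
    length += np.sqrt(np.trace(W.T @ W)) * h

expected = np.sqrt(n) * abs(log_k)
```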

EDIT 2. Answer to Asaf. OK, I understand. If we choose the metric $g_Z(X,Y)=tr((Z^{-1}X)^TZ^{-1}Y)$ (Riemannian manifold $\mathcal{R}$), then when $X(t)$ heads toward a non-invertible matrix $S$, lengths blow up (due to the factor $Z^{-1}$) and the time to reach $S$ is infinite. If the chosen metric is $g_Z(X,Y)=tr(X^TY)$ (Riemannian manifold $\mathcal{E}$), then this phenomenon does not occur.

$\mathcal{R}$ has the advantage of being geodesically complete and, consequently, complete as a metric space; although $\mathcal{R}$ is not compact, any two points $A$ and $B$ can be connected by a geodesic whose length is $d(A,B)$. Yet there may be several such geodesics, and the result above, $d(I_n,kI_n)=\sqrt{n}|\log(k)|$, must still be proved (intuitively, I am sure it is true). The advantage of $\mathcal{E}$: the geodesic curves are very simple and, from any matrix $A$, we can aim at any matrix $B$ (yet we may hit a wall in finite time).

EDIT 3. The answer to your question is NO. Take $A=\begin{pmatrix}-1/2&-1/4\\-1/2&-1/2\end{pmatrix},B=\begin{pmatrix}3/2&3/4\\1/2&1/2\end{pmatrix}\in GL_2^+$; then $\det(A+t(B-A))=(t-1/4)(t-1/2)$. The geodesic $X(t)$ meets the algebraic cone $\det(U)=0$ twice ($t=1/4,1/2$), in two points $P,Q\notin GL_n$ that are distinct from the apex $0$ of the cone, with, each time, a change of the sign of $\det(X(t))$. The direction of the geodesic, $[2,1,1,1]$, is not tangent to the cone at $P,Q$. Hence these intersections between the geodesic (dimension $1$) and the cone (dimension $3$) are transversal in a vector space of dimension $3+1=4$. Therefore a small perturbation of $X(t)$ does not allow us to cross to the other side of the mirror.
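A quick NumPy check of the determinant computation, confirming the quadratic and the sign change between its roots:

```python
import numpy as np

A = np.array([[-0.5, -0.25], [-0.5, -0.5]])
B = np.array([[1.5, 0.75], [0.5, 0.5]])

# det(A + t(B - A)) should be the quadratic (t - 1/4)(t - 1/2).
ts = np.linspace(0.0, 1.0, 101)
dets = np.array([np.linalg.det(A + t * (B - A)) for t in ts])
target = (ts - 0.25) * (ts - 0.5)
max_gap = np.max(np.abs(dets - target))
```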

This impossibility occurs when the function $t\in (0,1)\mapsto\det(A+t(B-A))$ has at least two sign changes.

EDIT 4. Proposition. Let $\Sigma=diag((\sigma_i))$ be as in Asaf's note. Then, for the Euclidean metric on $GL_n^+$, $d(\Sigma,SO_n)=d(\Sigma,I_n)$, the $\min$ being reached only at $I_n\in SO_n$.

Proof. If $B\in SO_n$, then $\|B-\Sigma\|_F^2=n+tr(\Sigma^2)-2\sum_i\sigma_i b_{i,i}$. Thus we seek $\sup_{B\in SO_n}(\sum_i\sigma_i b_{i,i})$. Since every $\sigma_i>0$ and $b_{i,i}\leq 1$, the sup is attained at the SOLE point $B=I_n$. Finally, the segment $[\Sigma,I_n]$ lies in $GL_n^+$, so $d(\Sigma,I_n)$ equals the Euclidean distance $\|\Sigma-I_n\|_F$, while for every other $B\in SO_n$ the Euclidean distance, a lower bound for $d$, is strictly larger; we are done.
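For $n=2$ the Proposition is easy to confirm by brute force over rotation angles (a sketch; the singular values are illustrative):

```python
import numpy as np

sigma = np.array([2.0, 0.5])             # illustrative positive singular values
Sigma = np.diag(sigma)

# Scan SO(2) and record the closest rotation to Sigma in Frobenius norm.
best_th, best_d = None, np.inf
for th in np.linspace(0.0, 2 * np.pi, 36000, endpoint=False):
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    d = np.linalg.norm(R - Sigma)
    if d < best_d:
        best_d, best_th = d, th
```

The minimum is attained at $\theta=0$, i.e. at $B=I_2$, with value $\|\Sigma-I_2\|_F$, matching the Proposition.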

Answer:

For most matrices $A, B$, the distance is $\sqrt{\sum_{i,j} (a_{ij} - b_{ij})^2}$, because that's the distance in $\mathbb{R}^{n^2}$ and we're inside an open subset. The problem is that sometimes the straight line between the matrices leaves the subset.

Consider the case $n = 2$, and look at the matrices $\pm I$. I'm not even certain how to define the distance in this case -- as the inf of arc length over all paths between them? In that case, the original formula still works, for by an arbitrarily small perturbation you can avoid the zero matrix.

As for uniqueness: no.

Look at $GL(2)$, with $SO(2)$ inside it. Use coordinates $$ \begin{bmatrix} x & y \\ z & w \end{bmatrix} $$ The points of $SO(2)$ lie in the subspace where $x = w$ and $y = -z$. Take a vector orthogonal to this subspace, such as $$ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} $$ The distance from this to any point of $SO(2)$ will be the same. In particular, the squared distance to $$ \begin{bmatrix} c & s \\ -s & c \end{bmatrix} $$ where $c$ and $s$ are the cosine and sine of some angle, is \begin{align} d^2 &= (1-c)^2 + 2s^2 + (-1 - c)^2\\ &= 1 - 2c + c^2 + 2s^2 + 1 + 2c + c^2\\ &= 2 + 2c^2 + 2s^2\\ &= 2 + 2(c^2 + s^2)\\ &= 4. \end{align}
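A short numerical confirmation that this matrix is equidistant from all of $SO(2)$ (Frobenius distance $2$ at every angle):

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, -1.0]])   # the matrix orthogonal to the SO(2) plane

# Distance from A to the rotation by each sampled angle.
dists = []
for th in np.linspace(0.0, 2 * np.pi, 360, endpoint=False):
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    dists.append(np.linalg.norm(A - R))
dists = np.array(dists)
```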

Answer:

Unfortunately, I don't think your conjecture is correct. More specifically:

Consider $A = I \in Gl_2^+$ and $B = \begin{bmatrix} 1 & 6 \\ -2 & -7\end{bmatrix}$. Then $d_{Gl_2^+}(A,B) > d_{M_2}(A,B)$.

For simplicity, let's simultaneously scale $d_{Gl_2^+}$ and $d_{M_2}$ so that $d_{M_2}(A,B) = 1$.

Let $\gamma(t) = tA + (1-t) B$, which is a geodesic in $M_2$ between $A$ and $B$. The key point is not only that $\det(\gamma(t)) = 0$ when $t = 1/2$ and $t=5/6$, but that $\det(\gamma(t)) < 0$ when $t\in(1/2, 5/6)$. So, the idea is that for any curve $\alpha$ parameterized by arc length which connects $A$ and $B$, if the length of $\alpha$ is close to $1$, then $\alpha(t)$ must be close to $\gamma(t)$. Further, because $\gamma(t)$ has negative determinant for $t\in (1/2,5/6)$, if $\alpha(t)$ is close to $\gamma(t)$, then $\alpha(t)$ must have negative determinant for some $t$. Thus, $\gamma$ can only be well approximated by curves which do not lie entirely in $Gl_2^+$.
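The determinant claim is easy to verify numerically (here $\det(\gamma(t))$ works out to the quadratic $(6t-5)(2t-1)$):

```python
import numpy as np

A = np.eye(2)
B = np.array([[1.0, 6.0], [-2.0, -7.0]])

def gamma(t):
    return t * A + (1 - t) * B

# det(gamma(t)) = (6t - 5)(2t - 1): zero at t = 1/2 and t = 5/6,
# strictly negative in between.
ts = np.linspace(0.0, 1.0, 1001)
dets = np.array([np.linalg.det(gamma(t)) for t in ts])
poly = (6 * ts - 5) * (2 * ts - 1)
max_gap = np.max(np.abs(dets - poly))
negative_ts = ts[dets < 0]
```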

Now for more rigor. Suppose $\alpha$ is parameterized by arc length, connects $A$ and $B$, and has length $1+\epsilon$.

Let $t_0$ denote any element of $(0,1)$ and assume $\alpha(t_0)\neq \gamma(t_0)$. I claim that $d(t_0) := d(\alpha(t_0), \gamma(t_0))\leq \sqrt{2\epsilon + \epsilon^2}$.

Here's why: draw a triangle whose base is the segment $\gamma$ from $A$ to $B$, whose apex is $\alpha(t_0)$, and mark the point $\gamma(t_0)$ on the base. Then $d(t_0)$ is the length of the segment connecting $\alpha(t_0)$ and $\gamma(t_0)$; let $\theta$ be the angle this segment makes with the base at $\gamma(t_0)$, measured toward $A$. By the law of cosines, $d(\alpha(0) ,\alpha(t_0)) = \sqrt{t_0^2 + d(t_0)^2 - 2 t_0 d(t_0)\cos\theta}.$ But $d(\alpha(0),\alpha(t_0))\leq t_0$ since $\alpha$ is parameterized by arc length. In particular, this implies $\theta \leq \pi/2$.

Now, $d(\alpha(t_0), \alpha(1+\epsilon)) \leq 1+ \epsilon - t_0$, but it's also given by the law of cosines as $d(\alpha(t_0), \alpha(1+\epsilon)) = \sqrt{(1-t_0)^2 + d(t_0)^2 - 2(1-t_0) d(t_0) \cos(\pi-\theta)}$, and thus, $$\sqrt{(1-t_0)^2 + d(t_0)^2 - 2(1-t_0) d(t_0) \cos(\pi-\theta)} \leq 1 - t_0 + \epsilon.$$

Squaring both sides and simplifying gives $$d^2 - 2(1-t_0)d\cos(\pi-\theta) \leq 2\epsilon(1-t_0) + \epsilon^2.$$

But $-2(1-t_0)d\cos(\pi-\theta) = 2(1-t_0)d \cos(\theta) \geq 0$ because $\theta \leq \pi/2$ and $t_0\leq 1$. Thus, $d^2 \leq d^2 - 2(1-t_0)d\cos(\pi-\theta)$.

It follows that $d^2 \leq 2\epsilon(1-t_0) + \epsilon^2$. But the largest $1-t_0$ can be is $1$, so we get $d^2 \leq 2\epsilon + \epsilon^2$. That is, $d \leq \sqrt{2\epsilon + \epsilon^2}$.

We now finish off the proof. Specialize $t_0$ to your favorite $t_0\in(1/2,5/6)$, so $\det(\gamma(t_0))< 0 $. Because $Gl_2^-$ is an open subset of $M_2$, there is a number $\delta > 0$ with the property that the ball $B(\gamma(t_0), \delta)$ is completely contained in $Gl_2^-$. Set $\epsilon = \sqrt{1+\delta^2}-1 > 0$, so that $\sqrt{\epsilon^2 + 2\epsilon} = \delta$.

I claim the following:

Suppose $\alpha(t)$ is parameterized by arc length and connects $A$ and $B$. If the length of $\alpha$ is smaller than $1+\epsilon$ (with $\epsilon$ given above), then $\alpha(t_0)$ has negative determinant. In particular, $d_{Gl_2^+}(A,B) \geq 1+\epsilon$.

Proof: By our previous discussion, $d(\alpha(t_0), \gamma(t_0))\leq \sqrt{\epsilon^2 + 2\epsilon} = \delta$. In particular, $\alpha(t_0)\in B(\gamma(t_0),\delta)\subseteq Gl_2^-$, so $\alpha(t_0)$ has negative determinant.
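As a tiny check of the $\epsilon$–$\delta$ bookkeeping: choosing $\epsilon=\sqrt{1+\delta^2}-1$ gives exactly $\sqrt{\epsilon^2+2\epsilon}=\delta$, since $\epsilon^2+2\epsilon=(1+\epsilon)^2-1$:

```python
import numpy as np

# For several sample deltas, verify sqrt(eps^2 + 2*eps) recovers delta
# when eps = sqrt(1 + delta^2) - 1.
deltas = np.array([0.01, 0.1, 0.5, 2.0])
eps = np.sqrt(1 + deltas**2) - 1
gap = np.max(np.abs(np.sqrt(eps**2 + 2 * eps) - deltas))
```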

Answer:

"Conjecture: The Riemannian distance equals the Euclidean one."

Not possible. It would be equivalent to the real matrices with $\det > 0$ forming a convex set, which they don't.

You can't just argue that the $\det=0$ locus is the zero set of a polynomial, which has finite intersection with any line. The intersections can be avoided in complex coordinates, but not for a polynomial hypersurface in $\mathbb{R}^m$, which has codimension $1$ and separates the space into components. Entire intervals of the line between two points may be excluded. Geodesics between points near the hypersurface may have to crawl along the hypersurface, or take some more complicated path.