Can the Gram-Schmidt projection be thought of as finding the nearest point in the subspace of orthogonal matrices, with a suitable distance?

224 Views Asked by At

Let $A \in M_{n \times n}(\mathbb(R))$. If we apply Gram-Schmidt to $A$ we get a matrix $A' \in O_n(\mathbb{R})$. We could also try to solve the optimization problem $\min_{X \in O_n} d(A,X)$, where $d$ is some reasonable metric.

Question: Is there a choice of metric (or divergence) $d$ such that these two notions coincide?

Notes:

  • For the Frobenius norm, I know that if $A = U \Sigma V^T$ is the SVD, then the solution to the minimization problem is given by $ UV^T$ ( https://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem ).

  • For the Frobenius distance minimization, the natural connection is to the polar decomposition: $S = U \Sigma U^T$ and $P = U V^T$, then $SP =U \Sigma U^T U V^T = U \Sigma V^T = A$. Alternatively, we could have used $L = V \Sigma V^T$, and $W = UV^T$, to write $A = WL$. So, the rotation part of the polar decomposition of $A$ is the Frobenius norm nearest point in $O(n)$.

  • If we write $A = QR$, where $Q$ is orthogonal and $R$ is non-negative upper triangular (the $QR$-decomposition), then this is unique and $A' = Q$. Based on this, we would like the metric to satisfy that $d(X, QR)$ is minimized by $X = Q$. Assuming $d$ is invariant under orthogonal transformations, this gives $d( Q^{-1} X, R)$, so we want $d(X,R)$ to be minimized by $X = I$ whenever $R$ is a non-negative upper triangular matrix and $X$ is orthogonal. If $\mathcal{R}_n$ denotes the non-negative upper triangular matrices, then $\mathcal{R}_n \cap O(n) = \{I\}$ (otherwise there would obviously be no such metric). It seems like we'd want a metric $d$ where $O(n)$ touches $\mathcal{R}_n$ "d-orthogonally" at $I$. The Frobenius norm does not work, since if $R = [A, [1,10],[0,1]]$ then $d_{Frob} (R,I) = 10$ but $d_{Frob} (R, [ [0,1],[1,0]]) = \sqrt{84}$. This calculation is not surprising because it is rare for $QR$ and polar decompositions to coincide -- that only happens if the PSD part (or the upper triangular part) is diagonal.

  • A reasonable thing to do next is to numerically try other common matrix norms. There is some discussion of rotation invariant norms here: Characterizing orthogonally-invariant norms on the space of matrices .

  • This https://arxiv.org/pdf/1601.01875.pdf seems sorta related.

1

There are 1 best solutions below

4
On

By applying GS to $A$ we get an orthonormal set $A'$ where the span of the first $i$ columns of $A$ and $A'$ are the same.

Naively you could do something like checking that the $i$-th column of $A$ is in the span of the first $i$ columns of $X$. That is, $$ d(A,X) = \sum_{i=1}^{n} \| (I - (X_{1:i})(X_{1:i})^T)A_i \| $$ where $A_i$ is the $i$th column of and $A$ and $X_{1:i}$ is the matrix of columns $1,2,\ldots,i$.

This isn't a metric in the "metric space" sense because it isn't symmetric, but you could symmetrize it. Then you would have to see if it is subadditive (maybe with the right norm it is). I'm not really sure this counts as "nice" though.

If you assume that $d$ is invariant under unitary transforms then you're just looking for an orthogonal matrix with the same span as $A$, so you can minimize,

$$ d(A,X) = \| AA^+ - XX^+\| = \|A(A^TA)^{-1}A^T - XX(X^TX)^{-1}X^T\| $$ where $A^+$ is the pseudoinverse. If $X$ is unitary, then $XX^+ = XX^T$.