I'm working on an optimization problem on manifolds and I'm having a bit of a conceptual issue with choosing between the Grassmann and Stiefel manifolds. Grassmann(2, 3) is the linear subspace of dimension 2 within the space $\mathbb{R}^3$, so all planes through the origin. So a point on the manifold corresponds to a plane, invariant to linear mixing of support vectors. Stiefel(2, 3) would be all possible planes through the origin that are the span of two orthonormal vectors. So my questions are:
Does that mean that a point on the Stiefel manifold is also a point on the Grassmann manifold (in the sense that all linear subspaces of a given span can be reduced to the same orthonormal components)?
What do I gain by optimizing on the Grassmann vs the Stiefel manifold, if the parametrization of the points is the same?
To understand how to choose between Grassmann and Stiefel manifolds, it helps to understand better the difference. In the following I will only look at the case of real vector space $\mathbb{R}^n$. Much information can be found at https://en.wikipedia.org/wiki/Stiefel_manifold and https://en.wikipedia.org/wiki/Grassmannian.
The Grassmannian $Gr(k,n)$ is the (compact) manifold of all $k$-dimensional linear subspaces in $\mathbb{R}^n$. So the one-dimensional case $k=1$ is real projective space. The Stiefel manifold $V_k(n)$ is (compact) manifold of all (orthogonal) $k$-frames (hereafter we leave out "orthogonal" which is always implied). A $k$-frame is a set of $k$ orthonormal vectors in $\mathbb{R}^n$, so can alternatively be described as the manifold of $n\times k$ column orthogonal matrices.
Let us compare the definitions for a few low-dim cases:
- $Gr(1,n)$ and $V_1(n)$
$Gr(1,n)$ is the set of all line through the origin, while $V_1(n)$ is the set of all unit vectors, so is the unit sphere. To each point in $Gr(1,n)$ there corresponds two (antipodal) unit vectors in $V_1(n)$.
- $Gr(2,n)$ and $V_2(n)$
$Gr(2,n)$ is the set of all 2-planes (through the origin) while $V_2(n)$ is all 2-frames in $\mathbb{R}^n$. To each point in $Gr(2,n)$ (that is, plane) there corresponds in $V_2(n)$ the set of all orthogonal bases for that plane.
-
General case
There is a natural projection $p\colon V_r(n) \mapsto Gr(r,n)$ which to each frame in $V_r(n)$ sends it to the plane in $Gr(r,n)$ of which it is a basis. In this way, we can define $Gr(r,n)$ as a quotient space of $V_r(n)$.
What does all this mean for your optimization problem? If you are only interested in linear subspaces, use $Gr(r,n)$, but if how you parametrize the space with an orthogonal basis is important, use the Stiefel manifold.
But, algorithmically, it could be easier to represent a Stiefel manifold, so you can use that but build into the step-finding algorithm that you avoid walking to a new frame which is a basis for the same subspace. See Optimization Algorithms on Matrix Manifolds