I am reading the paper
- Alan Edelman, Tomas A. Arias, Steven T. Smith, The geometry of algorithms with orthogonality constraints, SIAM Journal on Matrix Analysis and Applications, Volume 20, Number 2, 1998.
in which they consider the Stiefel manifold $M$ (all $n \times p$ matrices $X$ satisfying $X^T X = I_p$, i.e., with $p \leq n$ orthonormal columns).
This is considered as embedded in Euclidean space $Mat_{n\times p}=\mathbb{R}^{np}$ with Euclidean metric $(X,Y)=\operatorname{tr}(X^TY)$. At each point $X\in M$ we then have the tangent space $T_X(M)$ and the normal space $N_X(M)=T_X(M)^\perp$ (orthogonal complement in $Mat_{n\times p}$). The formula for the orthogonal projection $\pi_X$ onto the normal space $N_X(M)$ at the point $X\in M$ is given by $$\DeclareMathOperator{\sym}{sym} \pi_X(\Delta)=X \sym(X^T\Delta),\text{ where }\sym(A):=(A+A^T)/2 $$ (equation 2.3, p6).

They then go on to develop a differential equation for parallel transport of a tangent vector $\Delta\in T_{X(0)}(M)$ along a curve $X(t)$ on $M$ (equation 2.16, p9). The intuition behind it is given as follows: let $\Delta(t)\in T_{X(t)}(M)$ be the tangent vector $\Delta$ parallel transported to the point $X(t)$ along the curve $X$. To first order, parallel transport of $\Delta(t)$ from $X(t)$ to $X(t+dt)$ consists in shifting the vector $\Delta(t)$ to the new base point $X(t+dt)\simeq X(t)+dt X'(t)$ and projecting onto the tangent space $T_{X(t+dt)}(M)$, that is, removing the normal component $\pi_{X(t)+dt X'(t)}(\Delta(t))$. This would yield the equation $$ \Delta(t+dt)\simeq \Delta(t)-\pi_{X(t)+dt X'(t)}(\Delta(t)) $$ (to first order). They then give a cryptic remark on how this can be done by differentiating the formula for projection onto the normal space, resulting in equation 2.16, p9 of the paper.
However, if we follow the above equation, we can rewrite it as $$ \Delta(t+dt)-\Delta(t)\simeq -(X(t)+dt X'(t))\sym[(X(t)+dt X'(t))^T\Delta(t)]. $$ With $G=\Delta(t)$ fixed, the expression $E \sym(F^TG)$ is bilinear in $E$ and $F$. Multiplying out accordingly, dropping the term containing $(dt)^2$ and observing that $$ X(t)\sym[X(t)^T\Delta(t)]=0 $$ since $\Delta(t)\in T_{X(t)}(M)$, we obtain (to first order) $$ \Delta(t+dt)-\Delta(t)\simeq -dt\left[ X(t)\sym(X'(t)^T\Delta(t))+X'(t)\sym(X(t)^T\Delta(t))\right], $$ implying the differential equation $$ \Delta'(t)=-X(t)\sym(X'(t)^T\Delta(t))-X'(t)\sym(X(t)^T\Delta(t)). $$ This differs from equation 2.16, p9 in the paper, which only has the first summand on the right. What's wrong with this?
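Nothing is wrong: the two equations agree. Since $M$ is defined by $X^TX=I_p$, differentiating this constraint shows that the tangent space at $X$ is the set of all $A\in\mathbb{R}^{n\times p}$ such that $X^TA+A^TX=0$, i.e., $\operatorname{sym}(X^TA)=0$. So $\Delta$ satisfies $\operatorname{sym}(X(t)^T\Delta(t))=0$ for all $t$, hence the second term $X'(t)\operatorname{sym}(X(t)^T\Delta(t))$ vanishes. This is the same identity you already used to drop the zeroth-order term $X(t)\operatorname{sym}[X(t)^T\Delta(t)]$.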
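If a numerical sanity check helps, here is a minimal NumPy sketch of the argument. The helper names (`sym`, `normal_projection`, `tangent_projection`, `random_stiefel_point`), the sizes `n = 6`, `p = 3` and the random seed are my own choices for illustration, not anything from the paper. It builds a random point $X$ on the Stiefel manifold, two random tangent vectors at $X$ (one playing the role of $\Delta$, one the role of the velocity $X'$), and evaluates the two summands of your right-hand side.

```python
import numpy as np

def sym(A):
    """Symmetric part (A + A^T) / 2 of a square matrix."""
    return 0.5 * (A + A.T)

def normal_projection(X, D):
    """Projection of D onto the normal space at X: X sym(X^T D)."""
    return X @ sym(X.T @ D)

def tangent_projection(X, Z):
    """Projection of Z onto the tangent space at X (remove the normal part)."""
    return Z - normal_projection(X, Z)

def random_stiefel_point(n, p, rng):
    """Random n x p matrix with orthonormal columns, via reduced QR."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    return Q

rng = np.random.default_rng(0)   # arbitrary seed, illustration only
n, p = 6, 3                      # arbitrary sizes with p <= n

X = random_stiefel_point(n, p, rng)
Delta = tangent_projection(X, rng.standard_normal((n, p)))  # stands in for Delta(t)
Xdot = tangent_projection(X, rng.standard_normal((n, p)))   # stands in for X'(t)

first_term = X @ sym(Xdot.T @ Delta)    # the summand that appears in the paper
second_term = Xdot @ sym(X.T @ Delta)   # the extra summand in the question

print("||sym(X^T Delta)|| =", np.linalg.norm(sym(X.T @ Delta)))
print("||second term||    =", np.linalg.norm(second_term))
print("||first term||     =", np.linalg.norm(first_term))
```

The second summand should come out at the level of rounding error, while the first is generically nonzero, so the two forms of the equation coincide on tangent vectors.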