Dimensionality not matching for differential of matrix with orthogonality constraints


Out of curiosity, I was reading through the following answer about calculating the differential of a matrix with orthogonality constraints. Briefly, the mathematics works out as follows:

$ \text{Let } X \in \mathbb{R}^{m\times n} \text{ with } m > n \text{, so that } X^TX=I_{n\times n}. $

$ \text{Then } d(X^TX)= dX^T X + X^T dX = 0. $

I originally saw no problem with that, but then I noticed that $dX^T X\in \mathbb{R}^{n\times n}$ and $dX X^T\in \mathbb{R}^{m\times m}$.

If these two matrices have different dimensions, how is their sum even defined? Or how should one interpret this at all?


There is 1 best solution below


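$\newcommand{\R}{\mathbb{R}}$ Here $dX\in \R^{m\times n}$, and it will be clearer if you use brackets: $$d(X^TX) = (dX)^TX + X^T(dX) = 0.$$ The first term is $(n\times m)(m\times n)$ and the second is $(n\times m)(m\times n)$, so both terms are in $\R^{n\times n}$ and the sum is well defined. The product $dX\,X^T \in \R^{m\times m}$ never appears in this expression. Equivalently, if $\Delta_X$ is tangent to the Stiefel manifold at $X$, differentiating the relation $X^TX=I$ in the direction $\Delta_X$ shows that $(\Delta_X)^TX + X^T(\Delta_X) = 0$, that is, $X^T\Delta_X$ is antisymmetric.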
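If it helps to see the dimensions concretely, here is a small numerical sketch using NumPy. It is only an illustration under my own assumptions: the function name `tangent_check`, the sizes $m=5$, $n=3$, and the way the tangent direction is built (projecting a random matrix so that $X^T\Delta_X$ becomes antisymmetric) are choices made for this example, not something taken from the linked answer.

```python
import numpy as np

def tangent_check(m=5, n=3, seed=0):
    """Hypothetical helper: build X with X^T X = I and a tangent direction D."""
    rng = np.random.default_rng(seed)
    # Reduced QR of a random m x n matrix gives X with orthonormal columns.
    X, _ = np.linalg.qr(rng.standard_normal((m, n)))
    # Take an arbitrary Z and remove the symmetric part of X^T Z,
    # so that X^T D is antisymmetric (D plays the role of Delta_X / dX).
    Z = rng.standard_normal((m, n))
    XtZ = X.T @ Z
    D = Z - X @ (XtZ + XtZ.T) / 2
    term1 = D.T @ X   # (dX)^T X, shape (n, n)
    term2 = X.T @ D   # X^T (dX), shape (n, n)
    return X, D, term1, term2

X, D, term1, term2 = tangent_check()
print(term1.shape, term2.shape)          # both are n x n
print(np.allclose(term1 + term2, 0))     # the two terms cancel
print(np.allclose(term2, -term2.T))      # X^T dX is antisymmetric
print((D @ X.T).shape)                   # dX X^T would be m x m, but it is not in the formula
```

Both terms of the differential have shape $n\times n$, so they can be added; the $m\times m$ product only shows up if the transpose is attached to the wrong factor.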