Does $HOX = OX$ imply $HO^TX = O^TX$?


Suppose $X \in \mathbb{R}^{n \times p}$ has rank $p$, and let $H = X(X^TX)^{-1}X^T$ be the corresponding projection matrix. If there is an orthogonal matrix $O \in \mathbb{R}^{n \times n}$ such that $HOX = OX$, I am wondering if its inverse satisfies this identity as well, i.e.,

$$HO^TX = O^TX. \tag{*}$$

This problem stems from a statement in a statistics textbook claiming that, under the conditions listed above, the set of orthogonal matrices

$$\mathcal{G} = \{O \in \mathcal{O}: HOX = OX\}$$

forms a group (I am assuming the group operation is matrix multiplication). To verify this, it is necessary to show that $(*)$ holds, so that every member of $\mathcal{G}$ has its inverse in $\mathcal{G}$ as well. This seemingly trivial fact turns out to be quite challenging for me. Is it really technical, or might $\mathcal{G}$ fail to be a group?


There are 3 best solutions below

0
On BEST ANSWER

Phrased like that, it looks hard. But one can see that $$\tag1 \mathcal G=\{O:\ O\mathcal X=\mathcal X\}, $$ where $\mathcal X$ is the range of $X$.

Now the equality $O\mathcal X=\mathcal X$ is the same as $O^T\mathcal X=\mathcal X$ (apply $O^T=O^{-1}$ to both sides), and if $U,V\in\mathcal G$ then $UV\mathcal X=U\mathcal X=\mathcal X$. Since $I\in\mathcal G$ as well, $\mathcal G$ is a group.


Proof of $(1)$

If $O\mathcal X=\mathcal X$, then for any $v\in\mathbb R^p$, there exists $w\in\mathbb R^p$ with $OXv=Xw$; as $HX=X$, we get $HOXv=HXw=Xw=OXv$. As this works for all $v$, we get $HOX=OX$.

Conversely, if $HOX=OX$, then for any $v\in\mathbb R^p$ we have $OXv=HOXv\in\mathcal X$. So $O\mathcal X\subset \mathcal X$. As $O$ preserves dimension (being injective), $O\mathcal X=\mathcal X$.
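As a concrete illustration of $(1)$, here is a small worked case. The choice $n=3$, $p=2$, $X=[e_1\ e_2]$ is an assumption made purely for illustration and does not come from the question; $R$ denotes an arbitrary $2\times 2$ orthogonal matrix. In this case, at least, $\mathcal G$ consists of block-diagonal orthogonal matrices, and closure under transposition can be read off directly.

```latex
% Illustrative assumption: n = 3, p = 2, X = [e_1  e_2], so ran(X) is the xy-plane.
\begin{align*}
X &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix},
&
H &= X(X^TX)^{-1}X^T
   = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\end{align*}
% OX is the first two columns of O; H zeroes out their third entries.
\begin{align*}
HOX = OX
  &\iff O_{31} = O_{32} = 0 \\
  &\iff O = \begin{pmatrix} R & 0 \\ 0 & \pm 1 \end{pmatrix},
        \quad R^TR = I_2 .
\end{align*}
% Second equivalence: row 3 of O has unit norm, so O_{33} = \pm 1;
% column 3 then has unit norm only if O_{13} = O_{23} = 0.
\begin{align*}
O^T = \begin{pmatrix} R^T & 0 \\ 0 & \pm 1 \end{pmatrix}
  \quad\Longrightarrow\quad HO^TX = O^TX .
\end{align*}
```

Here $O\mathcal X=\mathcal X$ is visible in the block form: $O$ acts on the $xy$-plane through $R$ and on its orthogonal complement through $\pm 1$.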

0
On

Note that $H=X(X^TX)^{-1}X^T$ is the orthogonal projection onto the column space of $X$, $\mathrm{ran}(X)=\{Xu:u\in\Bbb R^p\}$:

  • $HXu=X(X^TX)^{-1}X^TXu=Xu$
  • If $v\perp\mathrm{ran}(X)$, then $X^Tv=0$, implying $Hv=0$.

Conversely, if $Hv=v$, then $v$ is already in the projected subspace, so $v\in\mathrm{ran}(X)$; similarly, $Hv=0$ implies $v\perp\mathrm{ran}(X)$.
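For completeness, the two algebraic properties that characterize an orthogonal projection, idempotence and symmetry, can be checked directly from the formula for $H$. This is a short sketch using only the definition of $H$ and the symmetry of $X^TX$.

```latex
\begin{align*}
H^2 &= X(X^TX)^{-1}\underbrace{X^TX\,(X^TX)^{-1}}_{=\,I_p}X^T
     = X(X^TX)^{-1}X^T = H, \\
H^T &= X\bigl((X^TX)^{-1}\bigr)^{T}X^T
     = X\bigl((X^TX)^{T}\bigr)^{-1}X^T
     = X(X^TX)^{-1}X^T = H .
\end{align*}
```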

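So, $HOX=OX$ means that $HOx_i=Ox_i$ for all columns $x_i$ of $X$, that is, $Ox_i\in\mathrm{ran}(X)$, so $O$ maps $\mathrm{ran}(X)$ into itself. Since $O$ is invertible, $O\,\mathrm{ran}(X)$ has the same dimension as $\mathrm{ran}(X)$, so in fact $O\,\mathrm{ran}(X)=\mathrm{ran}(X)$. Then $O^{-1}$ also keeps $\mathrm{ran}(X)$ invariant:
For each column, $O^{-1}x_i\in\mathrm{ran}(X)$, hence $HO^{-1}x_i=O^{-1}x_i$ by the above.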
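Note that the argument yields the identity for $O^{-1}$; orthogonality is what lets us replace $O^{-1}$ by $O^T$. A small example shows the transposed identity can fail for an invertible matrix that is not orthogonal. The matrices $A$ and $X$ below are hypothetical choices for illustration, not taken from the question.

```latex
% Illustrative assumption: n = 2, p = 1, A invertible but not orthogonal.
\begin{align*}
X &= \begin{pmatrix} 1 \\ 0 \end{pmatrix},
&
H &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
&
A &= \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.
\end{align*}
\begin{align*}
AX &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} = HAX,
&
A^{-1}X &= \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}
           \begin{pmatrix} 1 \\ 0 \end{pmatrix}
         = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = HA^{-1}X, \\
A^TX &= \begin{pmatrix} 1 \\ 1 \end{pmatrix},
&
HA^TX &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \neq A^TX .
\end{align*}
```

So $A$ and $A^{-1}$ both keep $\mathrm{ran}(X)$ invariant, while $A^T$ does not.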

0
On

Here I will rewrite Martin's answer for the record, using notation that I am more familiar with and adding some more details.

Let $\textrm{Im}(A)$ and $\textrm{Ker}(A)$ denote the range (or image) space and kernel space of any linear mapping (or matrix) $A$ respectively. Define \begin{align} \mathcal{G}' = \{O \in \mathcal{O}: \textrm{Im}(OX) = \textrm{Im}(X)\} \end{align}

We want to show that $\mathcal{G} = \mathcal{G}'$ and that if $O \in \mathcal{G}'$, then $O^T \in \mathcal{G}'$, which completes the proof.

First we show $\mathcal{G} = \mathcal{G}'$.

If $O \in \mathcal{G}'$, then for any $v \in \mathbb{R}^p$, there exists $w \in \mathbb{R}^p$ such that $OXv = Xw$, and therefore $$HOXv = HXw = Xw = OXv.$$ As this holds for all $v$, we conclude $HOX = OX$, i.e., $O \in \mathcal{G}$.

Conversely, if $O \in \mathcal{G}$, then for any $v \in \mathbb{R}^p$, $OXv = HOXv = X(X^TX)^{-1}X^TOXv \in \textrm{Im}(X)$, implying $\textrm{Im}(OX) \subset \textrm{Im}(X)$. On the other hand, $\textrm{Ker}(OX) = \textrm{Ker}(X)$ (this is what Martin means by "$O$ being injective"), and the rank-nullity theorem asserts that $\dim(\textrm{Im}(OX)) = \dim(\textrm{Im}(X))$. Together with $\textrm{Im}(OX) \subset \textrm{Im}(X)$, it follows that $\textrm{Im}(OX) = \textrm{Im}(X)$, i.e., $O \in \mathcal{G}'$.

Next we show if $O \in \mathcal{G}'$, then $O^T \in \mathcal{G}'$.

For any $v \in \mathbb{R}^p$, $Xv \in \textrm{Im}(X) = \textrm{Im}(OX)$ implies there exists $w \in \mathbb{R}^p$, such that $Xv = OXw$, hence $O^TXv = Xw \in \textrm{Im}(X)$, i.e., $\textrm{Im}(O^TX) \subset \textrm{Im}(X)$.

Conversely, for any $v \in \mathbb{R}^p$, $OXv \in \textrm{Im}(OX) = \textrm{Im}(X)$ implies there exists $w \in \mathbb{R}^p$ such that $OXv = Xw$, hence $Xv = O^TXw \in \textrm{Im}(O^TX)$, i.e., $\textrm{Im}(X) \subset \textrm{Im}(O^TX)$. Combining the two inclusions gives $\textrm{Im}(O^TX) = \textrm{Im}(X)$, i.e., $O^T \in \mathcal{G}'$.