Tangent space of the orthogonal group


I want to prove that $$T_IO(n,\mathbb{R}) = \{X\in \mathbb{R}^{n\times n} : X=-X^T\}$$

My only trouble is with that whole dropping second order terms business.

So I know that I want to define a curve through the identity whose tangent vector is $X$, say $$\gamma(t) = A(t) = I+tX, \qquad t\in(-\epsilon, \epsilon).$$ Now $A(0)=I$ and $\dot{A}(0)=X$, so $X\in T_IO(n,\mathbb{R})$. Moreover $$I = A^TA = (I+tX)^T(I+tX)= I + tX + tX^T + t^2 X^TX. $$

Now if you drop $t^2 X^TX$ you have the answer, but why do you drop it? What's the rationale?

Is it because $$I + tX + tX^T + t^2 X^TX \to I + tX + tX^T$$ faster than first order as $t\to 0$? Is that all?

EDIT: I'm also not sure about the converse. Why can you always build $\gamma$ from a skew-symmetric matrix $X$?
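(As a quick numerical sanity check of where the second-order term lives, assuming NumPy; this is an illustration, not part of the proof. For skew-symmetric $X$ the defect $(I+tX)^T(I+tX) - I$ equals exactly $t^2X^TX$, so it vanishes to second order in $t$:)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = A - A.T                          # skew-symmetric: X = -X^T

for t in [1e-1, 1e-2, 1e-3]:
    M = np.eye(3) + t * X            # the candidate curve A(t) = I + tX
    defect = np.linalg.norm(M.T @ M - np.eye(3))
    # the defect shrinks like t^2: defect / t^2 stays constant
    print(f"t = {t:g}, defect / t^2 = {defect / t**2:.6f}")
```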



BEST ANSWER

Here's my approach:

For $\epsilon > 0$, let

$J_\epsilon = \{ t \in \Bbb R \mid -\epsilon < t < \epsilon \} = (-\epsilon, \epsilon), \tag 1$

and let

$\alpha:J_\epsilon \to O(n, \Bbb R) \tag 2$

be a differentiable curve in $O(n, \Bbb R)$ with

$\alpha(0) = I \in O(n, \Bbb R); \tag 3$

then we have

$\alpha^T(t) \alpha(t) = I, \; t \in J_\epsilon; \tag 4$

differentiating (4) at any $t \in J_\epsilon$ yields

$(\alpha^T(t))' \alpha(t) + \alpha^T(t) \alpha'(t) = 0; \tag 5$

if we set $t = 0$ we find

$(\alpha^T(0))' \alpha(0) + \alpha^T(0) \alpha'(0) = 0; \tag{5'}$

now by virtue of (3) we have

$(\alpha^T(0))' I + I \alpha'(0) = 0, \tag 6$

or

$(\alpha^T(0))' + \alpha'(0) = 0; \tag 7$

now

$(\alpha^T(t))' = (\alpha'(t))^T, \; t \in J_\epsilon, \tag 8$

as is easy to see by simply differentiating and transposing the matrix

$\alpha(t) = [\alpha_{ij}(t)]; \tag 9$

it's so easy to see, in fact, that I will leave the details to my readers. Using (8), (7) becomes

$(\alpha'(0))^T + \alpha'(0) = 0, \tag{10}$

or

$(\alpha'(0))^T = -\alpha'(0), \tag{11}$

which shows that

$T_IO(n,\mathbb{R}) \subset \{X\in \mathbb{R}^{n\times n} : X=-X^T\}; \tag{12}$

now if $\beta$ is any fixed $n \times n$ skew-symmetric real matrix, that is,

$\beta \in \{X\in \mathbb{R}^{n\times n} : X=-X^T\}, \tag{13}$

then

$e^{ 0 \beta} = e^0 = I, \tag{14}$

and

$(e^{t\beta})^T e^{t\beta} = e^{t\beta^T} e^{t\beta} = e^{-t\beta}e^{t\beta} = e^{t(-\beta + \beta)} = e^0 = I, \tag{15}$

which shows that $e^{t\beta}$ is a path in $O(n, \Bbb R)$; furthermore

$(e^{t\beta})' = \beta e^{t\beta}, \tag{16}$

which shows that

$(e^{t\beta})'\mid_{t = 0} = \beta e^{0 \beta} = \beta I = \beta, \tag{17}$

i.e.,

$\beta \in T_IO(n, \Bbb R); \tag{18}$

thus, in addition to (12) we see that

$ \{X\in \mathbb{R}^{n\times n} : X=-X^T\} \subset T_IO(n,\mathbb{R}); \tag{19}$

therefore,

$T_IO(n,\mathbb{R}) = \{X\in \mathbb{R}^{n\times n} : X=-X^T\}, \tag{20}$

as was to be shown.
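The two halves of the argument can also be checked numerically (a sketch assuming NumPy and SciPy's `expm`; the random skew-symmetric $\beta$ is just an example), verifying (15) and (17):

```python
import numpy as np
from scipy.linalg import expm        # matrix exponential

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
beta = A - A.T                       # skew-symmetric: beta^T = -beta

# (15): e^{t beta} is orthogonal for every t
for t in [0.5, 1.0, 2.0]:
    Q = expm(t * beta)
    assert np.allclose(Q.T @ Q, np.eye(4))

# (17): the derivative of e^{t beta} at t = 0 is beta (central difference)
h = 1e-6
deriv = (expm(h * beta) - expm(-h * beta)) / (2 * h)
assert np.allclose(deriv, beta, atol=1e-8)
```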

What I like about this way of doing things is that $\alpha(t)$ actually lies in $O(n, \Bbb R)$ for $t \in J_\epsilon$, which obviates the need to address the fact that, for $X \ne 0$, $I + tX$ is not actually a member of $O(n, \Bbb R)$. Indeed, if it were true that

$I + t(X + X^T) + t^2 X^TX = I, \tag{21}$

then

$t(X + X^T) + t^2 X^TX = 0; \tag{22}$

differentiating, we find

$X + X^T + 2tX^TX = 0; \tag{23}$

when $t = 0$ we find

$X = -X^T; \tag{24}$

so far, so good; but if we differentiate again we obtain

$2X^TX = 0, \tag{25}$

or

$X^TX = 0, \tag{26}$

which implies

$X = 0 \tag{27}$

since then for $x \in \Bbb R^n$,

$\langle Xx, Xx \rangle = \langle x, X^TXx \rangle = 0 \Longrightarrow Xx = 0 \Longrightarrow X = 0. \tag{28}$

The remedy here is to realize that in fact $I + tX \notin O(n, \Bbb R)$, and to assume instead that our curve in $O(n, \Bbb R)$ can be written as a power series

$\displaystyle \sum_0^\infty X_k t^k \in O(n, \Bbb R), \; X_0 = I, \tag{29}$

and work with this infinite series:

$\left ( \displaystyle \sum_0^\infty X_k t^k \right )^T \displaystyle \sum_0^\infty X_k t^k = I, \tag{30}$

which yields

$\left ( \displaystyle \sum_0^\infty X_k^T t^k \right )\displaystyle \sum_0^\infty X_k t^k = I, \tag{31}$

and multiplying the series out:

$I + (X_1^T + X_1)t + (X_2^T + X_2 + X_1^TX_1)t^2$ $+ (X_3^T + X_3 + X_1^TX_2 + X_2^T X_1)t^3 + \ldots = I, \tag{32}$

and we find from this equation that

$X_1^T + X_1 = 0. \tag{33}$
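One can check the first two coefficient identities of (32) on a genuine curve $\alpha(t) = e^{tX}$ for skew-symmetric $X$, whose Taylor coefficients are $X_k = X^k/k!$ (a numerical sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
X = A - A.T                  # skew-symmetric

# Taylor coefficients of e^{tX}: X_0 = I, X_1 = X, X_2 = X^2/2
X1 = X
X2 = X @ X / 2

# (33): the first-order coefficient is skew-symmetric
assert np.allclose(X1.T + X1, 0)

# the t^2 coefficient of (32) vanishes: X_2^T + X_2 + X_1^T X_1 = 0
assert np.allclose(X2.T + X2 + X1.T @ X1, 0)
```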


Let me prove a slightly more general statement:

Let $G$ be a matrix group defined by a bilinear form $B\colon V\times V\to \mathbb R$, i.e. $$G = \{ g\in \operatorname{GL}(V)\, | \, B(gv,gw) = B(v,w),\ \forall v,w\in V \}.$$

Then the Lie algebra $\mathfrak g$ of $G$ is given by $$\mathfrak g=\{ X\in\mathfrak{gl}(V)\, \mid \, B(Xv,w) + B(v,Xw) = 0,\ \forall v,w\in V \}.\tag{1}$$

We can also write it more conveniently as $$G = \{ g\in\operatorname{GL}(V)\,\mid\, g^tBg = B\}$$ and $$\mathfrak g = \{ X\in\mathfrak{gl}(V)\,\mid\, X^tB + BX = 0\}$$ where $B$ is the matrix of the bilinear form $B$ (yeah, yeah, I use the same notation for both), i.e. $B(v,w) = v^tBw$. Your statement is obvious from this since the orthogonal group is given as the group that preserves inner product, so we can take $B = I$.


Now, to the proof. Let $G$ and $B$ be as above.

A known theorem states that the Lie algebra of a closed matrix group $G$ is given by $$\mathfrak g = \{ X\in\mathfrak{gl}(V)\,\mid\, \exp(tX)\in G,\ \forall t\in\mathbb R\}\tag{2}$$ and we want to show that $(1)$ and $(2)$ describe the same set (I'll use $\mathfrak g_{1}$ and $\mathfrak g_{2}$ to distinguish them for now).

For $X\in\mathfrak{gl}(V)$ and $v,w\in V$, define the function $f\colon\mathbb R\to \mathbb R$ by $f(t) = B(e^{tX}v,e^{tX}w)$. Notice that because $B$ is bilinear, we can use the Leibniz rule to find the derivative of $f$:

$$\left.\frac{d}{dt}f(t)\right|_{t=0} = \left.\frac{d}{dt}B(e^{tX}v,e^{tX}w)\right|_{t=0} = B(Xv,w)+B(v,Xw).\tag{3}$$

Now, assume that $X\in\mathfrak g_2$. That means that $e^{tX}\in G$ for all $t$, so $f(t) = B(v,w)$ for all $t$. Thus, $f$ is constant, so it has zero derivative, implying $B(Xv,w) + B(v,Xw) = 0$ by $(3)$. Since $v,w$ were arbitrary, this gives us $X\in\mathfrak g_1$.

Conversely, let $X\in\mathfrak g_1$. Applying the defining property of $\mathfrak g_1$ to the vectors $e^{tX}v$ and $e^{tX}w$ shows, as in $(3)$, that $f'(t) = B(Xe^{tX}v,e^{tX}w) + B(e^{tX}v,Xe^{tX}w) = 0$ for all $t$, so $f$ is constant. That means that for all $t$, we have $B(e^{tX}v,e^{tX}w) = B(v,w)$. Since $v,w$ were arbitrary, $e^{tX}\in G$ for all $t$, so $X\in\mathfrak g_2$.
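As a sanity check of this equivalence (a numerical sketch assuming NumPy and SciPy; the Lorentz-type form $B = \operatorname{diag}(1,1,-1)$ is just one example of a symmetric invertible $B$), one can verify that $X^TB + BX = 0$ forces $e^{tX}$ to preserve $B$:

```python
import numpy as np
from scipy.linalg import expm

# a symmetric, invertible bilinear form (Lorentz-type signature, for example)
B = np.diag([1.0, 1.0, -1.0])

rng = np.random.default_rng(3)
S = rng.standard_normal((3, 3))
S = S - S.T                          # skew-symmetric
X = np.linalg.inv(B) @ S             # then X^T B + B X = S^T + S = 0

assert np.allclose(X.T @ B + B @ X, 0)

# e^{tX} preserves B for every t: (e^{tX})^T B e^{tX} = B
for t in [0.3, 0.7, 1.1]:
    g = expm(t * X)
    assert np.allclose(g.T @ B @ g, B)
```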