why $X(X^TAX)^{-1}X^T=A^+$ if and only if $N(X^T)=N(A^T)$?

81 Views Asked by At

Here $A$ is a non-invertible symmetric matrix but $(X^TAX)$ is invertible, $A^{+}$ represents the pseudo-inverse of $A$. $N(X^T)$ and $N(A^T)$ denote left null space of $X$ and $A$ respectively.

Now, how to show $X(X^TAX)^{-1}X^T=A^+$ if and only if $N(X^T)=N(A^T)$?

2

There are 2 best solutions below

0
On BEST ANSWER

Let $A$ be $n\times n$. Since $X^TAX$ is supposed to be invertible, it follows that $X$ is $n\times m$ with $m < n$ and $X$ is injective. In turn, $X^T$ is surjective. Note also that $\mathbb R^n = N(A)\oplus R(A)$ as $A$ is symmetric. $R(A)$ is invariant under $A$. Let $B := A|_{R(A)}$. Then $B$ is an invertible linear map and $$ A^+ = B^{-1}P_{R(A)}, $$ where $P_{R(A)}$ is the orthogonal projection onto $R(A)$. In particular, $N(A^+) = N(A)$.

(a) Let $A^+ = X(X^TAX)^{-1}X^T$. This implies $N(X^T)\subset N(A^+) = N(A)$. Now, if $x\in N(A)$, then $X(X^TAX)^{-1}X^Tx = 0$. Hence, for all $u\in\mathbb R^n$ we have $$ 0 = \langle X(X^TAX)^{-1}X^Tx,u\rangle = \langle (X^TAX)^{-1}X^Tx,X^Tu\rangle. $$ As $X^T$ is surjective, it follows that $(X^TAX)^{-1}X^Tx = 0$ and thus $x\in N(X^T)$.

(b) Assume that $N(X^T) = N(A)$ (and thus $R(X) = R(A)$) and put $L := X(X^TAX)^{-1}X^T$. We have to show that $L = A^+$. For this, let us first prove that $ALA = A$. For $y\in R(X)$, $y = Xx$, we have $$ ALAy = AX(X^TAX)^{-1}X^TAXx = AXx = Ay. $$ As $ALAy = Ay$ for $y\in N(A)$ holds trivially, we have proved $ALA = A$. This also gives $ALz = z$ for $z\in R(A)$. And since $ALz = 0$ for $z\in N(A) = N(X^T)\subset N(L)$, we have that $AL = P_{R(A)}$. In particular, $AL$ is symmetric. As also $L$ is symmetric, this gives $LA = (AL)^T = AL$. Thus, also $LA$ is symmetric and $$ LAL = X(X^TAX)^{-1}X^TAX(X^TAX)^{-1}X^T = X(X^TAX)^{-1}X^T = L. $$ This proves that $L = A^+$ (see https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse#Definition).

0
On

(partial answer) I'll formulate the problem in terms of linear transformations to give an alternative view and show $ “\implies” $ We will see that the matrix $ X $ gives us the means by which to “diagonalise” $ A $.

Let

  • $E,V$ be inner-product spaces,
  • $x : E \to V$ and $x^* : V \to E$ its transpose (=adjoint operator),
  • $a : V \to V $.

The assumption “$ x^*ax $ is invertible” tells us that

  1. $ x : E \to \mathrm{im}(x) $ is an isomorphism, because $ \ker(x) = 0 $, otherwise $ \ker(x^*ax) \neq 0 $ would hold.

Further, the assumptions $\ker(x^*) = \ker(a^*)$ and $ a = a^* $ allow us to infer

  1. $ \ker(a) = \ker(a^*) = \ker(x^*) = \mathrm{im}(x)^\perp $
  2. $ \mathrm{im}(a) = \ker(a^*)^\perp = \ker(x^*)^\perp = \mathrm{im}(x) $

This has the consequence that $ a : V \to V $ decomposes into a map between direct sums:

\begin{equation*} a = \begin{pmatrix} \hat{a} & 0 \\ 0 & 0 \end{pmatrix} : \mathrm{im}(x)+\mathrm{im}(x)^\perp \to \mathrm{im}(x)+\mathrm{im}(x)^\perp \end{equation*}

where $ \hat{a} : \mathrm{im}(x) \to \mathrm{im}(x) $ is invertible. Similarly for $ x $ and $ x^* $:

\begin{equation*} x = \begin{pmatrix} \hat{x} \\ 0 \end{pmatrix} : E \to \mathrm{im}(x) + \mathrm{im}(x)^\perp \quad x^* = \begin{pmatrix} \hat{x}^* & 0 \end{pmatrix} : \mathrm{im}(x) + \mathrm{im}(x)^\perp \to E \end{equation*}

Finally

\begin{align*} x(x^*ax)^{-1}x^* =& x(\hat{x}^*\hat{a}\hat{x})^{-1}x^* \\=& \begin{pmatrix} \hat{x} \\ 0 \end{pmatrix} (\hat{x}^{-1}\hat{a}^{-1}(\hat{x}^*)^{-1}) \begin{pmatrix} \hat{x}^* & 0 \end{pmatrix} \\=& \begin{pmatrix}\hat{a}^{-1} & 0 \\ 0 & 0 \end{pmatrix} = a^+ \end{align*}