Why is the pseudoinverse of an orthogonal projection matrix itself?


Both this paper and Wikipedia state that an orthogonal projection matrix such as $(I - A^+A)$ is its own pseudoinverse, i.e., $$(I - A^+A)^+ = I - A^+A$$ Why is this the case? Can someone please help me understand how this can be proved?


There are 2 best solutions below

Best answer

After a few months I think I am able to answer my own question. As @greg said, the key is to use the four Moore-Penrose conditions:

Consider a real linear operator $A: \mathbf{X}\rightarrow\mathbf{Y}$. A real linear operator $M: \mathbf{Y}\rightarrow\mathbf{X}$ is the unique pseudoinverse of $A$, denoted $A^\dagger$, if and only if it satisfies the four Moore-Penrose conditions: \begin{equation*} \mathrm{(i)}\ (AM)^T = AM,\ \ \mathrm{(ii)}\ (MA)^T = MA,\ \ \mathrm{(iii)}\ AMA = A,\ \ \mathrm{(iv)}\ MAM = M \end{equation*}

Now substitute $(I - A^\dagger A)$ for both $A$ and $M$ in the four conditions above and check that they all hold.

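First, we need to show that $(I - \mathbf{A}^\dagger\mathbf{A})$ is idempotent, i.e. $(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) = (I - \mathbf{A}^\dagger\mathbf{A})$. This follows by direct calculation, using condition (iii) for $\mathbf{A}$, namely $\mathbf{A}\mathbf{A}^\dagger\mathbf{A} = \mathbf{A}$: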

\begin{align*} (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) &= I - \mathbf{A}^\dagger\mathbf{A} - \mathbf{A}^\dagger\mathbf{A} + \mathbf{A}^\dagger\mathbf{A}\mathbf{A}^\dagger\mathbf{A}\\ &= I - \mathbf{A}^\dagger\mathbf{A} - \mathbf{A}^\dagger\mathbf{A} + \mathbf{A}^\dagger\mathbf{A} && \text{(since } \mathbf{A}\mathbf{A}^\dagger\mathbf{A} = \mathbf{A}\text{)}\\ &= I - \mathbf{A}^\dagger\mathbf{A}. \end{align*}

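Next, $(I - \mathbf{A}^\dagger\mathbf{A})$ is symmetric, because $(\mathbf{A}^\dagger\mathbf{A})^T = \mathbf{A}^\dagger\mathbf{A}$ by condition (ii) for $\mathbf{A}$. Combining symmetry with the identity $(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) = I - \mathbf{A}^\dagger\mathbf{A}$, the four conditions read \begin{align} \mathrm{(i)}\quad ((I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}))^T &= (I - \mathbf{A}^\dagger\mathbf{A})^T = I^T - (\mathbf{A}^\dagger\mathbf{A})^T = I - \mathbf{A}^\dagger\mathbf{A} = (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A})\\ \mathrm{(ii)}\quad ((I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}))^T &= I - \mathbf{A}^\dagger\mathbf{A} = (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A})\\ \mathrm{(iii)}\quad (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) &= (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) = I - \mathbf{A}^\dagger\mathbf{A}\\ \mathrm{(iv)}\quad (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) &= (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) = I - \mathbf{A}^\dagger\mathbf{A} \end{align} Conditions (i) and (ii) coincide, as do (iii) and (iv), because the operator and its candidate pseudoinverse are the same matrix.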

All four conditions hold, so by uniqueness of the pseudoinverse we have shown that $(I - \mathbf{A}^\dagger\mathbf{A})^\dagger = (I - \mathbf{A}^\dagger\mathbf{A})$.
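As a sanity check, the four conditions can also be verified on a small concrete case. The sketch below is only an illustration: the matrix $A = \begin{pmatrix}1 & 1\end{pmatrix}$, its hand-computed pseudoinverse, and the helper names `matmul`, `transpose` and `is_pseudoinverse` are all assumptions made for this example, not part of the question. Exact fractions are used so that the equality tests are not affected by rounding.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


def is_pseudoinverse(A, M):
    """Return True if the pair (A, M) satisfies all four Moore-Penrose conditions."""
    AM, MA = matmul(A, M), matmul(M, A)
    return (
        transpose(AM) == AM        # (i)   (AM)^T = AM
        and transpose(MA) == MA    # (ii)  (MA)^T = MA
        and matmul(AM, A) == A     # (iii) AMA = A
        and matmul(MA, M) == M     # (iv)  MAM = M
    )


# Illustrative example (an assumption, not taken from the question): A = [1 1].
A = [[F(1), F(1)]]
A_pinv = [[F(1, 2)], [F(1, 2)]]  # hand-computed candidate for A^dagger
assert is_pseudoinverse(A, A_pinv)  # confirms the candidate really is A^dagger

# Build P = I - A^dagger A.
I2 = [[F(1), F(0)], [F(0), F(1)]]
AdA = matmul(A_pinv, A)
P = [[I2[i][j] - AdA[i][j] for j in range(2)] for i in range(2)]

# P satisfies the four conditions with itself as the candidate pseudoinverse.
print(is_pseudoinverse(P, P))
```

Here $P$ is the orthogonal projection onto the null space of $A$, the line spanned by $(1, -1)^T$. A single example proves nothing in general, but it makes the substitution of $(I - A^\dagger A)$ for both $A$ and $M$ concrete.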


When thinking about the pseudoinverse, I often find that the most illuminating viewpoint is to start with the geometric definition of the pseudoinverse of a matrix $A \in \mathbb R^{n \times m}$.

The pseudoinverse of $A$ takes as input a vector $b \in \mathbb R^n$ and returns as output the vector $x \in \mathbb R^m$ of least norm which satisfies $Ax = \tilde b$, where $\tilde b$ is the orthogonal projection of $b$ onto the range $R(A)$ of $A$.

Suppose $P \in \mathbb R^{n \times n}$ is a matrix that performs orthogonal projection onto a subspace $W$. Let $b \in \mathbb R^n$. The pseudoinverse of $P$ maps $b$ to the least norm solution of $Px = \tilde b$, where $\tilde b$ is the projection of $b$ onto $R(P) = W$.

But what is the vector of least norm whose projection onto $W$ is $\tilde b$? It is the vector $\tilde b$ itself, of course. In other words, the least norm solution to $Px = \tilde b$ is $x = \tilde b = Pb$.
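One way to spell out the "of course" step, assuming only that $P$ is an orthogonal projection onto $W$: first, $\tilde b$ is itself a solution, since $\tilde b \in W$ gives $P\tilde b = \tilde b$. Second, any solution $x$ of $Px = \tilde b$ splits as $x = Px + (x - Px) = \tilde b + (x - Px)$, where $x - Px$ is orthogonal to $W$ and hence to $\tilde b$. The Pythagorean theorem then gives $$\|x\|^2 = \|\tilde b\|^2 + \|x - Px\|^2 \ge \|\tilde b\|^2,$$ with equality exactly when $x = Px = \tilde b$.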

Therefore, the pseudoinverse of $P$ maps $b$ to $Pb$. Since this holds for every $b \in \mathbb R^n$, the pseudoinverse of $P$ is $P$ itself.