Both this paper and Wikipedia state that the pseudoinverse of the orthogonal projection matrix $(I - A^+A)$ is the matrix itself, i.e., $$(I - A^+A)^+ = I - A^+A$$ Why is this the case? Can someone please help me understand how this can be proved?
Why is the pseudoinverse of an orthogonal projection matrix itself?
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 answers below.
When thinking about the pseudoinverse, I often find that the most illuminating viewpoint is to start with the geometric definition of the pseudoinverse of a matrix $A \in \mathbb R^{n \times m}$.
The pseudoinverse of $A$ takes as input a vector $b \in \mathbb R^n$ and returns as output the vector $x$ of least norm which satisfies $Ax = \tilde b$, where $\tilde b$ is the orthogonal projection of $b$ onto $R(A)$.
Suppose $P \in \mathbb R^{n \times n}$ is a matrix that performs orthogonal projection onto a subspace $W$. Let $b \in \mathbb R^n$. The pseudoinverse of $P$ maps $b$ to the least norm solution of $Px = \tilde b$, where $\tilde b$ is the projection of $b$ onto $R(P) = W$.
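But what is the vector of least norm whose projection onto $W$ is $\tilde b$? It is the vector $\tilde b$ itself: any $x$ with $Px = \tilde b$ can be written as $x = \tilde b + z$ with $z \perp W$, so $\|x\|^2 = \|\tilde b\|^2 + \|z\|^2 \geq \|\tilde b\|^2$, with equality only when $z = 0$. In other words, the least norm solution to $Px = \tilde b$ is $x = \tilde b = Pb$.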
Therefore, the pseudoinverse of $P$ maps $b$ to $Pb$. This shows that the pseudoinverse of $P$ is $P$.
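The geometric argument above can be checked numerically. The following is only a minimal sketch, assuming NumPy is available; the subspace $W$ is taken (hypothetically) to be the column space of a random matrix `B`, and the `rcond=1e-10` cutoff is an illustrative choice so that round-off-sized singular values of `P` are treated as zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: W is the column space of a random 5x2 matrix B.
B = rng.standard_normal((5, 2))
Q, _ = np.linalg.qr(B)   # orthonormal basis of W (reduced QR)
P = Q @ Q.T              # orthogonal projector onto W

b = rng.standard_normal(5)
b_tilde = P @ b          # projection of b onto W

# Another solution of P x = b_tilde: add a component z orthogonal to W.
z = (np.eye(5) - P) @ rng.standard_normal(5)
x_other = b_tilde + z

solves_system = np.allclose(P @ x_other, b_tilde)
norm_is_larger = np.linalg.norm(x_other) >= np.linalg.norm(b_tilde)

# The pseudoinverse of P should send b to the least norm solution b_tilde = P b.
pinv_matches = np.allclose(np.linalg.pinv(P, rcond=1e-10) @ b, b_tilde)

print(solves_system, norm_is_larger, pinv_matches)
```

Here `x_other` solves the same system as `b_tilde` but carries an extra component orthogonal to $W$, which is why its norm cannot be smaller.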
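After a few months I think I am able to answer my own question. As @greg said, the key is to use the four Penrose conditions: a matrix $M$ is the pseudoinverse of $A$ if and only if $$(AM)^T = AM, \qquad (MA)^T = MA, \qquad AMA = A, \qquad MAM = M.$$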
Now substitute $(I - A^\dagger A)$ for both $A$ and $M$ and check whether the four conditions above are satisfied.
First, we need to show that $(I - \mathbf{A}^\dagger\mathbf{A})$ is idempotent, i.e., $(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) = (I - \mathbf{A}^\dagger\mathbf{A})$. This can be shown by direct calculation:
\begin{align*} (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) &= I - \mathbf{A}^\dagger\mathbf{A} - \mathbf{A}^\dagger\mathbf{A} + \mathbf{A}^\dagger\mathbf{A}\mathbf{A}^\dagger\mathbf{A}\\ &= I - \mathbf{A}^\dagger\mathbf{A} - \mathbf{A}^\dagger\mathbf{A} + \mathbf{A}^\dagger\mathbf{A} && \text{(since } \mathbf{A}\mathbf{A}^\dagger\mathbf{A} = \mathbf{A}\text{)}\\ &= I - \mathbf{A}^\dagger\mathbf{A}. \end{align*}
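Then, using idempotence together with the symmetry $(\mathbf{A}^\dagger\mathbf{A})^T = \mathbf{A}^\dagger\mathbf{A}$ (itself one of the Penrose conditions for $\mathbf{A}^\dagger$), we have \begin{align} ((I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}))^T &= (I - \mathbf{A}^\dagger\mathbf{A})^T = I^T - (\mathbf{A}^\dagger\mathbf{A})^T = I - \mathbf{A}^\dagger\mathbf{A}\\ ((I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}))^T &= I - \mathbf{A}^\dagger\mathbf{A}\\ (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) &= I - \mathbf{A}^\dagger\mathbf{A}\\ (I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A}) &= I - \mathbf{A}^\dagger\mathbf{A} \end{align} In the first two lines the right-hand side equals $(I - \mathbf{A}^\dagger\mathbf{A})(I - \mathbf{A}^\dagger\mathbf{A})$ by idempotence, so the two symmetry conditions hold; the last two lines are the conditions $AMA = A$ and $MAM = M$.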
Thus all four conditions hold, which shows that $(I - \mathbf{A}^\dagger\mathbf{A})^\dagger = (I - \mathbf{A}^\dagger\mathbf{A})$.
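As a sanity check, the four Penrose conditions can also be verified numerically. This is only an illustrative sketch, assuming NumPy; the random $3 \times 5$ matrix `A` is a hypothetical choice (wide, so that $I - A^\dagger A$ is not the zero matrix), and `rcond=1e-10` is an assumed cutoff for treating round-off-sized singular values as zero.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical wide matrix, so that I - A^+ A is a nonzero projector.
A = rng.standard_normal((3, 5))
A_pinv = np.linalg.pinv(A)

N = np.eye(5) - A_pinv @ A   # the matrix I - A^+ A from the question
M = N                        # candidate pseudoinverse: N itself

# The four Penrose conditions with N in the role of "A" and M as its candidate inverse.
conditions = {
    "(N M)^T = N M": np.allclose((N @ M).T, N @ M),
    "(M N)^T = M N": np.allclose((M @ N).T, M @ N),
    "N M N = N": np.allclose(N @ M @ N, N),
    "M N M = M": np.allclose(M @ N @ M, M),
}

# Independent check against NumPy's SVD-based pseudoinverse.
pinv_is_itself = np.allclose(np.linalg.pinv(N, rcond=1e-10), N)

for name, holds in conditions.items():
    print(name, holds)
print("pinv(N) == N:", pinv_is_itself)
```

A numerical check like this does not replace the proof, but it is a quick way to catch a sign or transpose mistake in the algebra.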