Generalized inverse/Pseudo Inverse

1k Views Asked by At

Let $A_{m. n}$ be a matrix with rank $p$ where $p\leq m$ and $p\leq n$.

First Question: We need to show that $A$ can be decomposed as a product of two matrices $A=BC$ where $B$ is an $m$ by $p$ and $C$ is an $p$ by $n$ and both $B$ and $C$ have rank $p$.

Second question: We define the generalized inverse of $A$ by $A^{+}=C^{T}(CC^{T})^{-1}(B^{T}B)^{-1}B^{T}$. Show that the solution to the equations: $Ax=b$ (in case it exists) is given by: $x=A^{+}b+e$ where $Ae=0$ and that the solution that has the smallest norm is given by: $x=A^{+}b$.

For the first question, I tried to decompose $A$ using the generalized QR decomposition where $Q$ has $p$ linearly independent columns and $R$ has all entries below the main diagonal equal to $0$. The issue is I don't know how to prove such decomposition in the case of this problem because not all columns of $A$ are linearly independent.

If anyone has another way to decompose $A$ like required by the statement of the problem, and that makes solving part 2 easy, please let me know.

For part 2, I have no clue how to do it, so any help is appreciated. Thanks!

1

There are 1 best solutions below

0
On BEST ANSWER

Ad 1) Of course one can use the QR factorisation here, but it is not necessary. Since the rank of $A=[a_1,\ldots,a_n]$ is $p$, there is a basis $B=[b_1,\ldots,b_p]\in\mathbb{R}^{m\times p}$ of the range of $A$, that is, the span of the columns $a_1,\ldots,a_n$. Every column of $A$ can be expressed by a linear combination of the columns of $B$ (since $B$ is a basis of the range of $A$) and hence $a_i=Bc_i$ for some vector $c_i\in\mathbb{R}^p$. Stacking the columns together gives $A=BC$ with $C=[c_1,\ldots,c_n]\in\mathbb{R}^{p\times n}$. Since the columns of $B$ form a basis of the range of $A$, the rank of $B$ is $p$. The rank of $A=BC$ is bounded by the minimum of the ranks of $B$ and $C$ and hence the rank of $C$ cannot be smaller than $p$ and so the rank of $C\in\mathbb{R}^{p\times n}$ is $p$ as well.

Note that $B$ is a basis of the range of $A$ while $C$ is a basis of the range of $A^T$. Obviously, the bases and neither this decomposition is unique (up to the spaces they span). If $A=BC$ and $M$ is any invertible $p\times p$ matrix, then $A=\tilde{B}\tilde{C}$ with $\tilde{B}=BM$, $\tilde{C}=M^{-1}C$.

Ad 2) The matrix $A^+$ defined above is the Moore-Penrose pseudo-inverse of $A$. You can easily verify the four Penrose conditions $AA^+A=A$, $A^+AA^+=A^+$, $(AA^+)^T=AA^+$, and $(A^+A)^T=A^+A$. Note that the matrices $A^+A$ and $I-A^+A$ are orthogonal projectors (idempotent and symmetric, actually they project onto the range of $A^T$ and the nullspace of $A$, respectively as you can verify, e.g., using the $BC$ factorization).

Let $x$ be a solution of $Ax=b$. The solution $x$ can be decomposed as $x=A^+Ax+(I-A^+A)x$. Since we have $Ax=b$, $x=A^+b+e$ with $e=(I-A^+A)x$ and $Ae=A(I-A^+A)x=Ax-AA^+Ax=Ax-Ax=0$.

From the fact that $A^+A$ and $I-A^+A$ are orthogonal projectors we have that $A^+b$ and $e$ are orthogonal and hence a solution $x=A^+b+e$ satisfies $\|x\|_2^2=\|A^+b\|_2^2+\|e\|_2^2$. The minimum is attained when $e=0$.