My question originates from a short proof in Strang's LA book that shows when
$A\pmb{x}=\pmb{b}$
every vector $\pmb{x_r}$ in the rowspace of $A$ projects uniquely to a single vector $\pmb{b_c}$ in the column space of $A$. The proof is discussed in this thread.
On that thread, many of the answers contradict each other (for example, some say $A$ must be full rank, others not) and the case where $\pmb{x}$ is in neither the rowspace nor the nullspace isn't discussed.
As a quick aside, I believe that proof only shows that the transformation $T_A|_{C(A^T)} : \pmb{x_r} \mapsto A\pmb{x_r}$ is an injective function, but not that $\forall \pmb{b} \in C(A) \ \exists \ \pmb{x_r} \in C(A^T) : T_A(\pmb{x_r}) = \pmb{b}$, i.e. it may or may not be bijective (anything further on this would be of great interest; I would love to know how to prove it).
I would like to try to fully characterise all the possibilities here and to get advice on whether my thought process is correct and whether it clears up any confusion in the other thread.
The column space $C(A)$ and left-nullspace $N(A^T)$ of an $m$ x $n$ matrix $A$ are two subspaces which are orthogonal complements of each other in $R^m$ (by the fundamental theorem of linear algebra). Together they span $R^m$: a basis for $C(A)$ combined with a basis for $N(A^T)$ gives a basis for $R^m$.
Therefore, vector $\pmb{b} \in R^m$ can be one of three things:
- In the column space of $A$, $\pmb{b_c} \in C(A)$.
- In the left-nullspace of $A$, $\pmb{b_n} \in N(A^T)$
- In neither, and therefore fully described as a linear combination of vectors in the column space and left nullspace, $\pmb{b} = c_1P_c\pmb{b} + c_2P_n\pmb{b}$ where $P_c$ and $P_n$ are projection matrices that project $\pmb{b}$ into the column and left-nullspace respectively.
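As a small illustration of the third case (the matrix and vector here are chosen only for this example and are not from the book), take $A = \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix}$, so that $C(A)$ is spanned by $(1,2)^T$ and $N(A^T)$ by $(2,-1)^T$. For $\pmb{b} = (1,0)^T$, which lies in neither subspace:

$$
\begin{aligned}
P_c\pmb{b} &= \frac{(1,0)\cdot(1,2)}{(1,2)\cdot(1,2)}\,(1,2)^T = \tfrac{1}{5}(1,2)^T = \left(\tfrac{1}{5}, \tfrac{2}{5}\right)^T, \\
P_n\pmb{b} &= \frac{(1,0)\cdot(2,-1)}{(2,-1)\cdot(2,-1)}\,(2,-1)^T = \tfrac{2}{5}(2,-1)^T = \left(\tfrac{4}{5}, -\tfrac{2}{5}\right)^T, \\
P_c\pmb{b} + P_n\pmb{b} &= (1,0)^T = \pmb{b}.
\end{aligned}
$$

In this example the two projections add up to $\pmb{b}$ directly, i.e. both coefficients in the linear combination are $1$.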
The rowspace $C(A^T)$ and nullspace $N(A)$ are orthogonal complements of each other in $R^n$. Similar to above, a vector $\pmb{x}$ can be in the rowspace, $\pmb{x_r} \in C(A^T)$, in the nullspace, $\pmb{x_n} \in N(A)$, or in neither, in which case it can be described as a linear combination of its projections onto the rowspace and nullspace. The proof in the above thread shows $A$ projects $\pmb{x_r}$ one-to-one (but not necessarily onto) the column space of $A$.
I think this is where the confusion arises in the other thread. The proof discussed in that thread is true for any $m$ x $n$ matrix. Specifically, $A\pmb{x_r}$ projects 1:1 to the column space of $A$, but there may be $\pmb{x} \notin C(A^T)$ whose transformations by $A$ are also forced into the column space. Therefore $A\pmb{x_r} \rightarrow \pmb{b_c}, \ \pmb{b_c} \in C(A)$ is injective but $A\pmb{x} \rightarrow \pmb{b_c}, \ \pmb{b_c} \in C(A)$ is surjective. The first answer in the linked thread fixes this such that $\pmb{x} = \pmb{x_r}$ by specifying that $A$ be full rank, but it does not actually have to be; it is just that the claim is only true for $\pmb{x}$ in the rowspace.
The three possibilities are:
- $\pmb{x}$ is in the rowspace and is projected 1:1 onto the column space
- $\pmb{x}$ is in the nullspace and projects to the zero vector
- $\pmb{x}$ is in neither the rowspace nor the nullspace, but is projected onto the columnspace of $A$, as shown below:
$A\pmb{x} = \pmb{b}$
$A(P_r \pmb{x} + P_n \pmb{x}) = \pmb{b}$
where $P_r$ and $P_n$ are projections onto the rowspace and nullspace of $A$ respectively.
$A P_r \pmb{x} + 0 = \pmb{b}$
Therefore when $\pmb{x}$ is not in the rowspace, $\pmb{b}$ is the transformation of its rowspace projection into the columnspace.
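To check this on a concrete case (a rank-one matrix and a vector picked purely for illustration), let $A = \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix}$. Its rowspace is spanned by $(1,1)^T$, its nullspace by $(1,-1)^T$, and its columnspace by $(1,2)^T$. Take $\pmb{x} = (3,1)^T$, which is in neither the rowspace nor the nullspace:

$$
\begin{aligned}
P_r\pmb{x} &= \frac{(3,1)\cdot(1,1)}{(1,1)\cdot(1,1)}\,(1,1)^T = (2,2)^T, \qquad
P_n\pmb{x} = \frac{(3,1)\cdot(1,-1)}{(1,-1)\cdot(1,-1)}\,(1,-1)^T = (1,-1)^T, \\
A P_n\pmb{x} &= (1-1,\ 2-2)^T = (0,0)^T, \\
A P_r\pmb{x} &= (2+2,\ 4+4)^T = (4,8)^T, \\
A\pmb{x} &= (3+1,\ 6+2)^T = (4,8)^T = 4\,(1,2)^T \in C(A).
\end{aligned}
$$

So here $A\pmb{x} = A P_r\pmb{x}$, and both $\pmb{x} = (3,1)^T$ and its rowspace projection $(2,2)^T$ are sent to the same $\pmb{b} = (4,8)^T$.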
Therefore to summarise:
- $T_A|_{C(A^T)}$ is an injective function into the columnspace of $A$.
- $T_A$ is a surjective function onto the columnspace of $A$.
Is this all correct or have I misunderstood something?
[EDIT]
I made a crucial error in the original question above and incorrectly used the term 'surjective' (i.e. onto) when I meant 'many to one'. To add as an additional question, I was asking:
- is the function $T_A|_{C(A^T)}$ injective?
- if we extend the domain to all $\pmb{x} \in R^n$, $A$ will still only project a vector into $C(A)$. Any $\pmb{x}$ that is not in the rowspace or nullspace will have its projection onto the rowspace transformed to the column space, $A P_r \pmb{x} \rightarrow C(A)$. Therefore $T_A$ is many-to-one (but it is not surjective, unless we restrict the codomain to the range)?
Let $T|_{R(A)}$ denote the function $T_A$ restricted to $R(A)$. We show that $T|_{R(A)}$ is injective. Since the orthogonal complement of $R(A)$ is $N(A)$, we see that $\ker T|_{R(A)} = \ker T_A \cap R(A)= \{0\}$. If the kernel of a linear transformation $L$ is trivial, then the function is injective. Try to show this yourself.
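For reference, one possible way to fill in both steps is sketched here (the names $\pmb{u}$, $\pmb{v}$ are introduced only for this sketch). First, a vector in both $R(A)$ and $N(A)$ is orthogonal to itself; second, linearity turns equal outputs into a kernel element:

$$
\begin{aligned}
\pmb{x} \in R(A) \cap N(A) &\implies \pmb{x}^T\pmb{x} = \|\pmb{x}\|^2 = 0 \implies \pmb{x} = \pmb{0}, \\
L(\pmb{u}) = L(\pmb{v}) &\implies L(\pmb{u} - \pmb{v}) = \pmb{0} \implies \pmb{u} - \pmb{v} \in \ker L = \{\pmb{0}\} \implies \pmb{u} = \pmb{v}.
\end{aligned}
$$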
If we restrict the codomain of $T|_{R(A)}$ to its range, i.e. $C(A)$, then $T|_{R(A)}$ is also surjective (this is just a general fact: if you restrict the codomain of a function to its range, then you end up with a surjective function). Hence, if you restrict both the domain and codomain of $T_A$, we see that $T|_{R(A)}$ is an isomorphism between $R(A)$ and $C(A)$.
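The fact that the range of the restricted map is all of $C(A)$, and not something smaller, can be seen with a short argument; here is a sketch, where $\pmb{x_r}$ and $\pmb{x_n}$ denote the rowspace and nullspace components of $\pmb{x}$. Take any $\pmb{b} \in C(A)$. By definition of the column space there is some $\pmb{x} \in R^n$ with $A\pmb{x} = \pmb{b}$. Writing $\pmb{x} = \pmb{x_r} + \pmb{x_n}$ with $\pmb{x_r} \in R(A)$ and $\pmb{x_n} \in N(A)$:

$$
A\pmb{x_r} = A(\pmb{x} - \pmb{x_n}) = A\pmb{x} - A\pmb{x_n} = \pmb{b} - \pmb{0} = \pmb{b}.
$$

So every $\pmb{b} \in C(A)$ is already hit by a vector in $R(A)$, which is the existence statement asked about in the question.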
There is no need for the matrix $A$ to be full rank in order for the proof in the linked post to hold.
Edit: original post was edited. 1. The function is injective as before. 2. $T_A$ can be invertible, so it is not necessarily many-to-one.