Projection matrix formula intuition

511 Views Asked by Bumbble Comm At 28 Mar 2026 - 8:46

I completely understand how the projection matrix formula $$P = A(A^TA)^{-1}A^T$$ is derived from $$A^T(b - A\hat{x}) = 0,$$ but what I don't understand is the "story proof" or the "intuition" behind the first formula as a linear transformation onto the column space of $A$, as it is supposed to be.

In fact I have three specific questions:

1. Why would someone transform $b$ (the vector to be projected onto the column space of $A$) into the "row space" of $A$ before it can be transformed into the column space of $A$?
2. What specific transformation does the matrix $A^TA$ encode?
3. Why should one apply the inverse of the transformation $A^TA$ to the vector before it can be transformed into the column space of $A$?


There are 2 best solutions below

Bumbble Comm On 19 Mar 2021 - 12:19 BEST ANSWER

Here is a beautiful one:

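The strategy for finding the projection $p = A\hat{x}$ of $b$ onto the column space of $A$ is to find the vector $p$ in that column space that has the same dot products with the columns of $A$ as $b$ does. This is just another way of reading $A^T(b - A\hat{x}) = 0$, which says $A^Tp = A^Tb$.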

So, first one should find the dot products of $b$ with each column of $A$ by computing the product $A^Tb$.

Then you want to find the linear combination of columns of $A$ that gives you the same dot products.

First, one must find the coefficients of this linear combination. The entries of the matrix $A^TA$ are the dot products of each column of $A$ with every other column and with itself. So the matrix $A^TA$ translates the coefficients of a linear combination of the columns of $A$ into the dot products of that combination with each column of $A$. Thus $(A^TA)^{-1}$ does the reverse: it takes the dot products with each column and spits out the necessary coefficient of each column in the linear combination. (This requires the columns of $A$ to be linearly independent, so that $A^TA$ is invertible.) That's exactly what we want.
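To make the "coefficients to dot products" reading concrete, here is a small sketch for the special case of two columns. The names $a_1, a_2$ for the columns of $A$ and $c_1, c_2$ for the coefficients are introduced here only for illustration:

$$A^TA\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} a_1 \cdot a_1 & a_1 \cdot a_2 \\ a_2 \cdot a_1 & a_2 \cdot a_2 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} a_1 \cdot (c_1 a_1 + c_2 a_2) \\ a_2 \cdot (c_1 a_1 + c_2 a_2) \end{pmatrix} = A^T(Ac).$$

So the input is the pair of coefficients and the output is the pair of dot products of the combination $Ac = c_1 a_1 + c_2 a_2$ with $a_1$ and $a_2$.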

Remember that by computing $A^Tb$ we found the dot product of $b$ with each column of $A$. Now we want to know which linear combination gives the same dot products, so we multiply $A^Tb$ by $(A^TA)^{-1}$.

Now we have the coefficients of the linear combination that has the same dot products with the columns of $A$ as $b$. We simply multiply $(A^TA)^{-1}A^Tb$, derived in the previous paragraph, by $A$: since we have the coefficients, we multiply each coefficient by its corresponding column and add them up. That's exactly what $A(A^TA)^{-1}A^Tb$ does.
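The three steps can be traced numerically. The following is only a minimal sketch in plain Python: the matrix `A`, the vector `b`, and the helpers `transpose`, `matmul` and `inverse_2x2` are made up for this illustration, and `A` is assumed to have exactly two linearly independent columns so that a 2x2 inverse is enough. Exact fractions are used so the dot products can be compared without rounding.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    # Entry (i, j) is the dot product of row i of X with column j of Y.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    # Closed-form inverse; only valid for an invertible 2x2 matrix.
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical data: columns (1,1,1) and (0,1,2) in R^3, and a vector b.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
b = [[Fraction(6)], [Fraction(0)], [Fraction(0)]]

At = transpose(A)

# Step 1: dot products of b with each column of A.
dots_b = matmul(At, b)

# Step 2: (A^T A)^{-1} turns those dot products back into coefficients.
coeffs = matmul(inverse_2x2(matmul(At, A)), dots_b)

# Step 3: combine the columns of A with those coefficients.
p = matmul(A, coeffs)

# p has the same dot products with the columns of A as b does.
dots_p = matmul(At, p)
print(dots_b == dots_p)
```

With this data the coefficients come out as $5$ and $-3$, so $p = 5\,(1,1,1) - 3\,(0,1,2) = (5, 2, -1)$, and both $b$ and $p$ have dot products $6$ and $0$ with the two columns.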

Bumbble Comm On 02 Jan 2021 - 9:46

First note that $P$ maps the column space $R(A)$ identically to itself. Indeed, for a vector $x$ in the domain of $A$, we have $\require{extpfeil}\Newextarrow{\xmapsto}{5,5}{0x27FC}$

$$Ax \,\xmapsto{A^T} A^TAx \,\xmapsto{(A^TA)^{-1}} x \,\xmapsto{A} Ax.$$

On the other hand, for a vector $y \in R(A)^\perp$, recall that $R(A)^\perp = N(A^T)$, so $$y\,\xmapsto{A^T} 0 \,\xmapsto{A(A^TA)^{-1}} 0.$$ Finally, the domain of $P$ can be decomposed as $R(A) \oplus R(A)^\perp$, so with respect to this decomposition we have $P = I \oplus 0$, which means precisely that $P$ is the orthogonal projection onto $R(A)$.
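Both cases can be checked on a small example. This is a sketch under assumptions, not part of the argument: the matrix `A`, the vectors `x` and `y`, and the helper functions are hypothetical choices made for illustration, with `A` having two independent columns so that $A^TA$ is an invertible 2x2 matrix.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    # Entry (i, j) is the dot product of row i of X with column j of Y.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    # Closed-form inverse; only valid for an invertible 2x2 matrix.
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical data: columns (1,1,1) and (0,1,2) in R^3.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
At = transpose(A)

# P = A (A^T A)^{-1} A^T
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

# A vector in R(A): v = A x for an arbitrary x.
x = [[Fraction(2)], [Fraction(1)]]
v = matmul(A, x)

# A vector in R(A)^perp = N(A^T): orthogonal to both columns of A.
y = [[Fraction(1)], [Fraction(-2)], [Fraction(1)]]

Pv = matmul(P, v)      # should reproduce v
Py = matmul(P, y)      # should be the zero vector
A_t_y = matmul(At, y)  # zero, confirming y is in N(A^T)

print(Pv == v, Py == [[0], [0], [0]], matmul(P, P) == P)
```

The last comparison checks $P^2 = P$, which is what $P = I \oplus 0$ implies.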
