Does $A^TA$ always result in a square matrix?


And does the resulting $A^TA$ matrix always have an inverse to solve for $\vec{w}$ in

$$A^TA\vec{w}=A^T\vec{t}$$ ?


BEST ANSWER

As I said in the comments, $A^\top A$ is always square (Shogun covers this in more detail in their answer). However, $A^\top A$ may fail to have an inverse: it is invertible precisely when the rank of $A$ equals the number of columns of $A$, or equivalently, when the columns are linearly independent (which necessarily fails whenever $A$ has more columns than rows).

To answer the follow-up question in the comments (properly): you solve the matrix equation $A^\top A \vec{w} = A^\top \vec{t}$, for example by Gauss-Jordan elimination.

Let's do an example to illustrate this. Take $$A = \begin{pmatrix} 1 & 2 & -1 \\ 0 & -1 & 1 \\ 2 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix}.$$ First, note that the second column plus the third column is the first column, and hence the columns are linearly dependent. I therefore expect $A^\top A$ to be a rank two $3 \times 3$ matrix, which will make it singular. Computing, $$A^\top A = \begin{pmatrix}1 & 0 & 2 & -1 \\ 2 & -1 & -1 & 1 \\ -1 & 1 & 3 & -2 \end{pmatrix}\begin{pmatrix} 1 & 2 & -1 \\ 0 & -1 & 1 \\ 2 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix} = \begin{pmatrix} 6 & -1 & 7 \\ -1 & 7 & -8 \\ 7 & -8 & 15 \end{pmatrix}.$$ This matrix is indeed singular, as (again, not coincidentally) the second and third columns add to give the first column (or compute the determinant).
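The rank claim above is easy to check numerically. A quick sketch using NumPy (not part of the original answer), with the matrix from the example:

```python
import numpy as np

# A is the matrix from the worked example above; its first column
# equals the sum of its second and third columns, so rank(A) = 2.
A = np.array([[1, 2, -1],
              [0, -1, 1],
              [2, -1, 3],
              [-1, 1, -2]], dtype=float)

AtA = A.T @ A  # the 3 x 3 matrix computed above
# rank(A^T A) = rank(A) = 2 < 3, so A^T A is singular
rank = np.linalg.matrix_rank(AtA)
```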

We use this when taking a vector $\vec{t} \in \Bbb{R}^4$ and finding a vector $\vec{y}$ in the columnspace of $A$ that is closest (in the usual Euclidean $2$-norm) to $\vec{t}$. Let's take $$\vec{t} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}.$$ We can see that $\vec{t}$ does not already belong to this columnspace by solving $A \vec{x} = \vec{t}$ and obtaining no solution. In this case, we would solve $$\begin{pmatrix} 1 & 2 & -1 \\ 0 & -1 & 1 \\ 2 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix},$$ which as an augmented matrix comes to \begin{align*} \left(\begin{array}{ccc|c} 1 & 2 & -1 & 1 \\ 0 & -1 & 1 & 1 \\ 2 & -1 & 3 & 1 \\ -1 & 1 & -2 & 1 \end{array}\right) &\sim \left(\begin{array}{ccc|c} 1 & 2 & -1 & 1 \\ 0 & -1 & 1 & 1 \\ 0 & -5 & 5 & -1 \\ 0 & 3 & -3 & 2 \end{array}\right) \\ &\sim \left(\begin{array}{ccc|c} 1 & 2 & -1 & 1 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & -6 \\ \color{red}0 & \color{red}0 & \color{red}0 & \color{red}5 \end{array}\right). \end{align*} The system is inconsistent, so indeed, $\vec{t}$ is not in the columnspace of $A$. 
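The inconsistency of $A\vec{x} = \vec{t}$ can also be seen by comparing ranks: the system is consistent exactly when appending $\vec{t}$ to $A$ does not increase the rank. A small NumPy check (my addition, not part of the original answer):

```python
import numpy as np

A = np.array([[1, 2, -1],
              [0, -1, 1],
              [2, -1, 3],
              [-1, 1, -2]], dtype=float)
t = np.ones(4)

# A x = t is consistent iff rank([A | t]) == rank(A)
aug = np.column_stack([A, t])
rank_A = np.linalg.matrix_rank(A)      # 2
rank_aug = np.linalg.matrix_rank(aug)  # 3: the rank jumps, so no solution
```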
However, multiplying $\vec{t}$ by $A^\top$ gives $$\begin{pmatrix}1 & 0 & 2 & -1 \\ 2 & -1 & -1 & 1 \\ -1 & 1 & 3 & -2 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}.$$ Our normal equation becomes $$\begin{pmatrix} 6 & -1 & 7 \\ -1 & 7 & -8 \\ 7 & -8 & 15 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix},$$ or as an augmented matrix, \begin{align*} \left(\begin{array}{ccc|c} 6 & -1 & 7 & 2 \\ -1 & 7 & -8 & 1 \\ 7 & -8 & 15 & 1 \end{array}\right) &\sim \left(\begin{array}{ccc|c} 1 & -7 & 8 & -1 \\ 6 & -1 & 7 & 2 \\ 7 & -8 & 15 & 1 \end{array}\right) \\ &\sim \left(\begin{array}{ccc|c} 1 & -7 & 8 & -1 \\ 0 & 41 & -41 & 8 \\ 0 & 41 & -41 & 8 \end{array}\right) \\ &\sim \left(\begin{array}{ccc|c} 1 & -7 & 8 & -1 \\ 0 & 1 & -1 & \frac{8}{41} \\ 0 & 0 & 0 & 0 \end{array}\right) \\ &\sim \left(\begin{array}{ccc|c} 1 & 0 & \color{red}1 & \frac{15}{41} \\ 0 & 1 & \color{red}{-1} & \frac{8}{41} \\ 0 & 0 & \color{red}0 & 0 \end{array}\right) \end{align*} Note that, as we always expect from the normal equation, we have consistency; there is no zero row with a non-zero augmented number. That is, we have a solution. Since $A^\top A$ is not invertible, we have a column without a pivot, which gives us a free variable. Since it occurs in the third column, $x_3$ is a good candidate to assign as a free variable.

Let $t = x_3$. We have from the first row, $x_1 + x_3 = \frac{15}{41}$, so $x_1 = -t + \frac{15}{41}$. From the second row, $x_2 - x_3 = \frac{8}{41}$, so $x_2 = t + \frac{8}{41}$.

Now, the actual point $\vec{y}$ that is closest to $\vec{t}$ from the columnspace of $A$ is given by $A \vec{x}$, where $\vec{x}$ is any one of the above solutions. It doesn't matter which; just pick any value of $t$ you want ($t = 0$ is usually easy). However, I'm going to leave $t$ arbitrary just to demonstrate that it doesn't matter. We have \begin{align*} \vec{y} &= A \vec{x} = \begin{pmatrix} 1 & 2 & -1 \\ 0 & -1 & 1 \\ 2 & -1 & 3 \\ -1 & 1 & -2 \end{pmatrix}\begin{pmatrix} -t + \frac{15}{41} \\ t + \frac{8}{41} \\ t \end{pmatrix} \\ &= \begin{pmatrix} 1(-t + \frac{15}{41}) + 2(t + \frac{8}{41}) - 1t \\ 0(-t + \frac{15}{41}) - 1(t + \frac{8}{41}) + 1t \\ 2(-t + \frac{15}{41}) - 1(t + \frac{8}{41}) + 3t \\ -1(-t + \frac{15}{41}) + 1(t + \frac{8}{41}) - 2t \end{pmatrix} \\ &= \begin{pmatrix} \frac{31}{41} \\ -\frac{8}{41} \\ \frac{22}{41} \\ -\frac{7}{41} \end{pmatrix}. \end{align*} That is, the result of the least squares doesn't depend on $t$, i.e. it doesn't depend on your choice of solution to the normal equation. Solving via Gauss-Jordan elimination, then arbitrarily picking a solution, will lead you to the same unique minimiser for the least squares problem.
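The whole computation can be checked with a least-squares solver. A sketch using NumPy's `np.linalg.lstsq` (my addition): it minimises $\|A\vec{x} - \vec{t}\|$ directly, and the projection $\vec{y} = A\vec{x}$ it produces agrees with the hand computation for every value of the free parameter.

```python
import numpy as np

A = np.array([[1, 2, -1],
              [0, -1, 1],
              [2, -1, 3],
              [-1, 1, -2]], dtype=float)
t = np.ones(4)

# lstsq solves the least-squares problem via SVD, returning the
# minimum-norm minimiser; the projection y = A x is unique.
x, *_ = np.linalg.lstsq(A, t, rcond=None)
y = A @ x

# every choice of the free parameter gives the same projection
for s in (0.0, 1.0, -3.5):
    xs = np.array([-s + 15/41, s + 8/41, s])
    assert np.allclose(A @ xs, y)
```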

ANSWER

Yes, it is always square.

Let $A$ be an $m \times n$ matrix. The dimensions of $A^T$ will then be $n \times m$.

Hence $A^TA$ will have dimensions $n \times n$, which is clearly a square matrix.
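The dimension count is trivial to confirm numerically. A minimal sketch in NumPy (my addition, with an arbitrarily chosen $5 \times 3$ matrix):

```python
import numpy as np

# For any m x n matrix A, A^T is n x m, so A^T A is n x n.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # m = 5, n = 3
shape = (A.T @ A).shape           # (3, 3): square regardless of m
```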

However, the inverse need not exist: for example, when $A = 0$, the product $A^TA = 0$ is certainly not invertible.

ANSWER

It's always square. If $A \in \mathbb{R}^{n \times m}$, then $A^{T} \in \mathbb{R}^{m \times n}$, so $A^{T}A \in \mathbb{R}^{m \times m}$.

The inverse exists if and only if $A$ has full column rank, since $\textrm{Rank}(A) = \textrm{Rank}(A^{T}A)$.
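The rank identity $\textrm{Rank}(A) = \textrm{Rank}(A^{T}A)$ can be spot-checked numerically. A sketch in NumPy (my addition), using a deliberately rank-deficient matrix built as a product of thin factors:

```python
import numpy as np

# Construct a rank-2 matrix of shape 6 x 4 as a product of
# a 6 x 2 and a 2 x 4 random factor.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

r = np.linalg.matrix_rank(A)
r_gram = np.linalg.matrix_rank(A.T @ A)
# rank 2 < 4 columns: A is not full column rank, so A^T A is singular
```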

However, any minimiser of $\|Ax - b\|^{2}$ satisfies $A^{T}Ax = A^{T}b$, the so-called normal equations. Solving these directly is not a good numerical method, though, since $\kappa(A^{T}A) = \kappa(A)^{2}$.
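The condition-number squaring is easy to observe. A sketch in NumPy (my addition), using a Vandermonde matrix as an arbitrary, moderately ill-conditioned example:

```python
import numpy as np

# In the 2-norm, cond(A^T A) = cond(A)^2: forming the Gram matrix
# squares the condition number of the problem.
A = np.vander(np.linspace(0, 1, 8), 5, increasing=True)  # 8 x 5, full column rank
k = np.linalg.cond(A)          # 2-norm condition number of A
k_gram = np.linalg.cond(A.T @ A)
# this is why QR/SVD-based solvers such as np.linalg.lstsq are
# preferred over solving the normal equations directly
```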