Unique solution to normal equation proof


So I'm trying to prove the following:

Show that a unique solution for linear regression exists iff the features are not linearly dependent. Namely, show that the minimizer $\hat{\theta} = \operatorname{argmin}_{\theta}\|X\theta - y\|^{2}$, given by $(X^{T}X)^{-1}X^{T}y$, exists and is unique iff $X$ has full column rank.

What I did was:

Theorem: For every matrix $A\in \mathbb{R}^{n\times m}$ we have $\operatorname{Null}(A) = \operatorname{Null}(A^{T}A)$ and $\operatorname{Rank}(A) = \operatorname{Rank}(A^{T}A)$.
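As a numerical sanity check of this theorem (not a proof), here is a short NumPy sketch with random matrices of my own choosing; it confirms that $\operatorname{Rank}(A) = \operatorname{Rank}(A^{T}A)$ both in the full-rank and the rank-deficient case:

```python
import numpy as np

# Sanity check: rank(A) == rank(A^T A) for a random tall matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))  # full column rank with probability 1
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T @ A) == 3

# Force a rank deficiency by duplicating a column; the ranks still agree.
B = np.column_stack([A[:, 0], A[:, 0], A[:, 1]])
assert np.linalg.matrix_rank(B) == np.linalg.matrix_rank(B.T @ B) == 2
print("ranks agree")
```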

Then I argued: if $(X^{T}X)^{-1}X^{T}y$ is a solution $\rightarrow$ $X^{T}X$ is invertible $\rightarrow$ $\operatorname{Null}(X^{T}X)=\{0\}$ $\rightarrow$ $\operatorname{Null}(X) = \{0\}$ $\rightarrow$ $X$ has full column rank.

But I'm not sure this is the right approach; also, the opposite direction seems to be just the same chain of implications reversed.

Would appreciate some help

Edit

So I derived the following normal equation by differentiating the MSE loss with respect to $\theta$:

$(X^{T}X)\theta = X^{T}y$

Here $\theta$ is the parameter I differentiated with respect to, and $X, y$ are the "training set".

So I now want to prove that the $\theta$ minimizing the loss is unique, given that $X$ has full column rank.
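To make the normal equation concrete, here is a small NumPy sketch on a synthetic training set (sizes and data are my own, purely illustrative): solving $(X^{T}X)\theta = X^{T}y$ directly recovers the same minimizer that NumPy's least-squares routine finds.

```python
import numpy as np

# Synthetic "training set": n=20 samples, m=3 features, full column rank.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))
y = rng.standard_normal(20)

# Solve the normal equations (X^T X) theta = X^T y.
theta = np.linalg.solve(X.T @ X, X.T @ y)

# Same minimizer via the standard least-squares routine.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(theta, theta_lstsq)
```

(In practice one prefers `lstsq` or a QR factorization over forming $X^{T}X$ explicitly, since $X^{T}X$ squares the condition number of $X$.)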

Accepted answer:

You have the right basic ideas, but the way you write things doesn't quite make sense. I would argue that the following statements are all equivalent:

  • There is a unique solution to $\text{argmin}_\theta\|X\theta - y\|$,
  • There is a unique solution to $X^TX \theta = X^Ty$,
  • $X^TX$ is invertible,
  • $X^TX$ has full rank,
  • $X$ has full column rank.

Indeed, these steps make sense in both the forward and backward directions.
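The failure of uniqueness in the rank-deficient case can also be seen numerically. In this sketch (my own illustrative example, not from the answer), $X$ has a duplicated feature column, so $\operatorname{Null}(X)$ is nontrivial; shifting any minimizer along a null-space direction leaves the fitted values, and hence the loss $\|X\theta - y\|$, unchanged:

```python
import numpy as np

# X with two identical columns: rank 1, so not full column rank.
rng = np.random.default_rng(2)
x0 = rng.standard_normal(10)
X = np.column_stack([x0, x0])
y = rng.standard_normal(10)

# One minimizer (lstsq returns the minimum-norm solution here).
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

# (1, -1) is in Null(X): X @ shift = x0 - x0 = 0, so theta + shift
# yields the same fitted values and the same loss -- no unique argmin.
shift = np.array([1.0, -1.0])
assert np.allclose(X @ theta, X @ (theta + shift))
```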