Naive question re. normal equation for linear regression

47 Views Asked by Bumbble Comm At 27 Mar 2026 - 4:20

The usual normal equation for linear regression is $\theta=(X^TX)^{-1} X^T Y$, chosen so that the gradient of $J(\theta)$ is zero. Why does $X^{-1} Y$ not work? What are the numerical reasons for this?


There are 2 best solutions below

Bumbble Comm On 24 Mar 2014 - 9:14

$X$ might not be invertible. It might not even be square, for that matter. The normal equation still works when $X$ is not invertible, as long as $X^TX$ is (that is, $X$ has full column rank). And if $X$ is invertible, the two agree: $(X^TX)^{-1}X^TY = X^{-1}X^{-T}X^TY = X^{-1}IY = X^{-1}Y$
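A quick numerical check of both cases, as a minimal sketch assuming NumPy. The matrix sizes, the seed, and all variable names are illustrative and not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Square case: shifting a random 4x4 matrix by 4*I keeps it well away
# from singular in practice, so X^{-1} exists.
X_sq = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)
Y_sq = rng.standard_normal(4)

theta_normal = np.linalg.solve(X_sq.T @ X_sq, X_sq.T @ Y_sq)  # (X^T X)^{-1} X^T Y
theta_direct = np.linalg.solve(X_sq, Y_sq)                     # X^{-1} Y
same_in_square_case = np.allclose(theta_normal, theta_direct)

# Rectangular case: 50 rows, 3 columns. X itself has no inverse, but
# X^T X is 3x3 and invertible whenever X has full column rank.
X_tall = rng.standard_normal((50, 3))
Y_tall = rng.standard_normal(50)
theta_tall = np.linalg.solve(X_tall.T @ X_tall, X_tall.T @ Y_tall)

# Gradient of the least-squares cost (up to a constant factor):
# X^T (X theta - Y). It should vanish at theta_tall.
gradient = X_tall.T @ (X_tall @ theta_tall - Y_tall)

print(same_in_square_case)
print(np.abs(gradient).max())
```

In the square case both routes should give the same $\theta$ up to rounding; in the tall case only the normal equation is defined, and the gradient at its solution should be zero up to rounding.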

Bumbble Comm On 24 Mar 2014 - 9:14

Please count the dimensions. In regression, $X$ typically has many more rows than columns, and a general rectangular matrix has no inverse.

You can use a QR decomposition of $X$ instead. Since $Q$ is orthogonal, $\|X\theta-Y\|=\|QR\theta-Y\|=\|R\theta-Q^TY\|$, and the last form is trivially minimized by solving the triangular system at the top and disregarding all the zero rows of $R$.
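The QR route can be sketched as follows, assuming NumPy. The data, seed, and variable names are made up for illustration. Note that `np.linalg.qr` returns the reduced factorization by default, so `R` here is already just the top triangular block and the zero rows never appear.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))   # many more rows than columns
Y = rng.standard_normal(100)

# Reduced QR: Q is 100x3 with orthonormal columns, R is 3x3 upper triangular.
Q, R = np.linalg.qr(X)

# Solve R theta = Q^T Y. np.linalg.solve is a general solver used here for
# brevity; a dedicated back-substitution routine would exploit the
# triangular structure of R.
theta_qr = np.linalg.solve(R, Q.T @ Y)

# Two reference solutions for comparison.
theta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
theta_normal = np.linalg.solve(X.T @ X, X.T @ Y)

# One numerical reason to prefer QR: forming X^T X squares the
# 2-norm condition number of the problem.
cond_X = np.linalg.cond(X)
cond_XtX = np.linalg.cond(X.T @ X)

print(np.allclose(theta_qr, theta_lstsq), np.allclose(theta_qr, theta_normal))
print(cond_X ** 2, cond_XtX)
```

On well-conditioned data like this all three solutions should agree. The difference tends to show up when the columns of $X$ are nearly dependent, because the normal equation works with $X^TX$, whose condition number is the square of that of $X$.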
