Can we rewrite the solution to the linear regression in other form?


I know the solution to the linear regression problem is $\beta=(X^TX)^{-1}X^TY$, and I want to know whether it can be rewritten as $X^T(XX^T)^{-1}Y$ or not, and why. Thanks for your help.



BEST ANSWER

You are probably referring to two variants of the same linear regression problem:

  1. underdetermined, when there are fewer observations than parameters: $X\beta = y$, where $X$ is a rectangular matrix of shape $m \times n$ with $n > m$;
  2. overdetermined, when there are more observations than parameters: $X$ is in this case a rectangular matrix of shape $m \times n$ with $n < m$.

For a problem of type 1 (underdetermined), there are infinitely many solutions; there is no unique solution.

Among these infinitely many solutions, one can decide to look for the one with the smallest squared Euclidean norm, which leads to the constrained optimization problem: \begin{equation} \begin{array}{rrclcl} \displaystyle \min_{\beta} & {||\beta||_{2}^{2}} \\ \textrm{s.t.} & X \beta & = & y \\ \end{array} \end{equation}

Writing the Lagrangian $L(\beta,\mu) = ||\beta||_{2}^{2} + \mu^{T}(y - X \beta)$

Taking the partial derivative with respect to $\beta$ and setting it to $0$: $\frac{\partial L(\beta,\mu)}{\partial \beta} = 2\beta - X^{T}\mu = 0$

Taking the partial derivative with respect to $\mu$ and setting it to $0$: $\frac{\partial L(\beta,\mu)}{\partial \mu} = y - X\beta = 0$

Substituting the first equation into the second, and assuming that $XX^{T}$ is invertible, we obtain $\mu = 2(XX^{T})^{-1}y$. Plugging this back into $2\beta = X^{T}\mu$ gives the solution $\beta = X^{T}(XX^{T})^{-1}y$. (1)
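As a quick numerical sanity check of solution (1), the following NumPy sketch (the matrix sizes and random seed are arbitrary choices for illustration) builds a random underdetermined system and verifies that $\beta = X^{T}(XX^{T})^{-1}y$ satisfies the constraint $X\beta = y$ and coincides with the minimum-norm (pseudoinverse) solution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Underdetermined system: fewer observations (m) than parameters (n).
m, n = 3, 5
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Minimum-norm solution derived above: beta = X^T (X X^T)^{-1} y.
# Solving the linear system avoids forming the inverse explicitly.
beta = X.T @ np.linalg.solve(X @ X.T, y)

# It satisfies the constraint X beta = y ...
assert np.allclose(X @ beta, y)

# ... and matches the pseudoinverse solution, since for a full
# row-rank X the pseudoinverse equals X^T (X X^T)^{-1}.
beta_pinv = np.linalg.pinv(X) @ y
assert np.allclose(beta, beta_pinv)
```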

For a problem of type 2 (overdetermined), the system generally has no exact solution (it is inconsistent). Thus we can only hope to find a good approximation, in the sense of minimizing the energy of the error $J(\beta) = ||y-X\beta||_{2}^{2}$.

We can formulate in this case the unconstrained optimization problem: \begin{equation} \begin{array}{rrclcl} \displaystyle \min_{\beta} & {||y-X\beta||_{2}^{2}} \\ \end{array} \end{equation}

Taking the derivative with respect to $\beta$ and setting it to $0$: $\frac{\partial ||y-X\beta||_{2}^{2}}{\partial \beta} = -2X^{T}y + 2X^{T}X\beta = 0$, which gives the normal equations $X^{T}X\beta = X^{T}y$.

The final solution (assuming $X^{T}X$ is invertible) : $\beta = (X^{T}X)^{-1}X^{T}y$ (2)
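Solution (2) can likewise be checked numerically. This NumPy sketch (again with arbitrary sizes and seed) builds a random overdetermined system and verifies that the normal-equations solution matches NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)

# Overdetermined system: more observations (m) than parameters (n).
m, n = 8, 3
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Normal-equations solution: beta = (X^T X)^{-1} X^T y.
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Matches NumPy's dedicated least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta, beta_lstsq)

# The residual y - X beta is orthogonal to the columns of X,
# i.e. the normal equations X^T X beta = X^T y hold.
assert np.allclose(X.T @ (y - X @ beta), np.zeros(n))
```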

You can now see that the two solutions (1) and (2) both answer a regression problem, but with different setups. Thus they are not interchangeable, with the single exception of the case where both $X^{T}X$ and $XX^{T}$ are invertible, which forces $X$ to be square and invertible.

Another answer

If both $X^TX$ and $XX^T$ are invertible, then $X$ is necessarily square and invertible, so $$ (X^TX)^{-1}X^TY = X^{-1}(X^T)^{-1}X^TY = X^{-1}Y, $$ and $$ X^T(XX^T)^{-1}Y = X^T(X^T)^{-1}X^{-1}Y = X^{-1}Y, $$ so in this case $\beta$ can be written in either of the two forms you mentioned. But if one of $X^TX$ and $XX^T$ fails to be invertible (which is in particular the case when $X$ is not a square matrix), then the answer is no.
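The square invertible case can also be verified numerically. This NumPy sketch (sizes and seed are arbitrary) checks that for a random square invertible $X$, both closed forms reduce to $X^{-1}Y$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Square X: a random Gaussian matrix is invertible with probability 1.
n = 4
X = rng.standard_normal((n, n))
y = rng.standard_normal(n)

b1 = np.linalg.solve(X.T @ X, X.T @ y)  # (X^T X)^{-1} X^T y
b2 = X.T @ np.linalg.solve(X @ X.T, y)  # X^T (X X^T)^{-1} y
b3 = np.linalg.solve(X, y)              # X^{-1} y

# All three expressions agree for square invertible X.
assert np.allclose(b1, b2)
assert np.allclose(b2, b3)
```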