Matrix expression simplification (linear regression)


$$
\begin{aligned}
\left\lVert Y - X\beta \right\rVert^2 &= (Y - X\beta)^T(Y - X\beta) \\
&= Y^TY - \beta^TX^TY - Y^TX\beta + \beta^TX^TX\beta \\
&= \left\lVert Y \right\rVert^2 - 2Y^TX\beta + \left\lVert X\beta \right\rVert^2
\end{aligned}
$$

I don't understand any of the three simplification steps here.

  1. In the first equation, how is this simplification done? Is this a known property?
  2. It seems like $(Y - X\beta)^T(Y - X\beta) = (Y^T - X^T\beta^T)(Y - X\beta)$, but that can't be quite right, since the last term is $\beta^TX^TX\beta$, not $X^T\beta^TX\beta$ -- how is this step done?
  3. Is $\left\lVert X\beta \right\rVert^2 = \beta^TX^TX\beta$ a known property? And why does $- \beta^TX^TY - Y^TX\beta$ equal $- 2Y^TX\beta$?
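For what it's worth, a quick numerical check suggests that all three lines of the derivation do agree, so my question is about why, not whether. The sketch below is only a sanity check under my own assumptions: $Y$ is an $n \times 1$ column, $X$ is $n \times p$, and $\beta$ is $p \times 1$ (the sizes `n = 5`, `p = 3`, the random values, and the helper names are all made up for illustration).

```python
import math
import random

def transpose(A):
    # Swap rows and columns of a matrix stored as a list of rows.
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    # Plain matrix product: entry (i, j) is row i of A dotted with column j of B.
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def sub(A, B):
    # Entrywise difference of two matrices of the same shape.
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sq_norm(v):
    # Squared Euclidean norm of a column vector stored as an n x 1 matrix.
    return sum(row[0] ** 2 for row in v)

random.seed(0)
n, p = 5, 3  # illustrative sizes, not from the original derivation
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
Y = [[random.gauss(0, 1)] for _ in range(n)]
beta = [[random.gauss(0, 1)] for _ in range(p)]

Xb = matmul(X, beta)   # X beta, an n x 1 column
r = sub(Y, Xb)         # Y - X beta

# Line 1: ||Y - X beta||^2 versus (Y - X beta)^T (Y - X beta), a 1 x 1 matrix.
lhs = sq_norm(r)
step1 = matmul(transpose(r), r)[0][0]

# Line 2: the four expanded terms, each of which is a 1 x 1 matrix.
YtY = matmul(transpose(Y), Y)[0][0]
bXY = matmul(matmul(transpose(beta), transpose(X)), Y)[0][0]
YXb = matmul(matmul(transpose(Y), X), beta)[0][0]
bXXb = matmul(matmul(matmul(transpose(beta), transpose(X)), X), beta)[0][0]
step2 = YtY - bXY - YXb + bXXb

# Line 3: the final form with norms.
step3 = sq_norm(Y) - 2 * YXb + sq_norm(Xb)

# (X beta)^T compared entry by entry with beta^T X^T (note the reversed order).
lhs_t = transpose(Xb)[0]
rhs_t = matmul(transpose(beta), transpose(X))[0]
transpose_rule_holds = all(math.isclose(a, b) for a, b in zip(lhs_t, rhs_t))

print(math.isclose(lhs, step1), math.isclose(step1, step2), math.isclose(step2, step3))
print(math.isclose(bXY, YXb), transpose_rule_holds)
```

In this check the two cross terms `bXY` and `YXb` come out as the same number, and $(X\beta)^T$ matches $\beta^TX^T$ with the factors in reversed order, which seems to be exactly where my attempt in item 2 goes wrong.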

Any help will be much appreciated, thanks in advance!