Prove $(X \theta - \vec{y})^T (X \theta - \vec{y}) = \theta^T X^T X \theta - \theta^T X^T \vec{y} - \vec{y}^T X \theta + \vec{y}^T \vec{y}$


I'm studying Stanford's CS229 Machine Learning course, and on page 11 of the lecture notes I don't see how step 2 follows from step 1.
Prof. Andrew Ng says it is the expansion of the quadratic $(X \theta - \vec{y})^T (X \theta - \vec{y})$, which comes from the derivation on page 10.
Can anyone explain how the expansion of the quadratic $(X \theta - \vec{y})^T (X \theta - \vec{y})$ equals $\theta^T X^T X \theta - \theta^T X^T \vec{y} - \vec{y}^T X \theta + \vec{y}^T \vec{y}$?


BEST ANSWER

The expression follows from the Distributive Law and the Transpose Rules of matrix algebra.

These are -

  1. $(A+B)(C+D)= AC+AD+BC+BD$

  2. $(AB)^T = B^T A^T$ and $(A+B)^T = A^T + B^T$

The first factor transposes as -

$(X\theta -\vec{y})^T = \theta^T X^T - \vec{y}^T$

The rest is simple multiplication.
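Spelling that multiplication out term by term:

$$(\theta^T X^T - \vec{y}^T)(X\theta - \vec{y}) = \theta^T X^T X \theta - \theta^T X^T \vec{y} - \vec{y}^T X \theta + \vec{y}^T \vec{y}$$

It is also worth noting that every term here is a scalar, and $\vec{y}^T X \theta = (\theta^T X^T \vec{y})^T = \theta^T X^T \vec{y}$, so the two middle terms are equal and are often combined into $-2\theta^T X^T \vec{y}$.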


The proofs of the properties used -

The Distributive Law for matrices follows entrywise from the distributive law for scalars.

For $(A+B)^T = A^T +B^T$ , compare $(i,j)^{\text{th}}$ elements -

The $(i,j)^{\text{th}}$ element of $(A+B)^T$ is the $(j,i)^{\text{th}}$ element of $A+B$, namely $a_{ji}+b_{ji}$, which is exactly the $(i,j)^{\text{th}}$ element of $A^T+B^T$.

For the product rule-

By definition, the $(i,j)^{\text{th}}$ element of $AB$ is

$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$

Now, the $(i,j)^{\text{th}}$ element of $(AB)^T$ is the $(j,i)^{\text{th}}$ element of $AB$:

$((AB)^T)_{ij} = (AB)_{ji} = \sum_{k=1}^{n} a_{jk} b_{ki} = \sum_{k=1}^{n} b_{ki}\, a_{jk} = \sum_{k=1}^{n} (B^T)_{ik} (A^T)_{kj} = (B^T A^T)_{ij}$

Hence $(AB)^T = B^T A^T$.

The above uses the fact that-

If $A=(a_{ij})$, then $A^T = (a_{ji})$.
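As a quick numerical sanity check (not part of the course notes; the shapes below are arbitrary illustrations), NumPy confirms both transpose rules on random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # arbitrary 3x4 matrix
B = rng.standard_normal((3, 4))  # same shape as A, so A + B is defined
C = rng.standard_normal((4, 2))  # 4x2, so the product A C is defined

# (A + B)^T = A^T + B^T
assert np.allclose((A + B).T, A.T + B.T)

# (A C)^T = C^T A^T  -- note that the order of the factors reverses
assert np.allclose((A @ C).T, C.T @ A.T)

print("both transpose rules check out")
```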

ANOTHER ANSWER

Since $(X\theta-\vec{y})^T=\theta^TX^T-\vec{y}^T$, the product is $$(\theta^TX^T-\vec{y}^T)(X\theta-\vec{y})=\theta^T X^T X \theta - \theta^T X^T \vec{y} - \vec{y}^T X \theta + \vec{y}^T \vec{y}.$$
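The whole identity is also easy to verify numerically. Below is a minimal sketch with NumPy, assuming an arbitrary $5 \times 3$ design matrix $X$, a $3 \times 1$ parameter vector $\theta$, and a $5 \times 1$ target vector $\vec{y}$ (the shapes are illustrative; any compatible shapes work):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((5, 3))      # design matrix: 5 examples, 3 features
theta = rng.standard_normal((3, 1))  # parameter vector
y = rng.standard_normal((5, 1))      # target vector

# Left-hand side: the quadratic form (X theta - y)^T (X theta - y)
lhs = (X @ theta - y).T @ (X @ theta - y)

# Right-hand side: the four-term expansion from the answers above
rhs = (theta.T @ X.T @ X @ theta
       - theta.T @ X.T @ y
       - y.T @ X @ theta
       + y.T @ y)

assert np.allclose(lhs, rhs)
```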