How do I take the derivative of a matrix equation?


Given an equation

$L = (y-Xw)'(y-Xw)$, how can I determine the derivative of $L$ with respect to $w$?

$y$ and $X$ are matrices, and $A'$ denotes the transpose of $A$.

This is in reference to a slide (image not included).


There are 2 best solutions below

0
On BEST ANSWER

$L = (y-Xw)'(y-Xw) = (y'-w'X')(y-Xw) = y'y - w'X'y - y'Xw + w'X'Xw$

$$dL = -dw'X'y - y'X\,dw + dw'X'Xw + w'X'X\,dw = -dw'X'y - dw'X'y + dw'X'Xw + dw'X'Xw$$

$$dL = dw'(-2X'y + 2X'Xw)$$

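I first took the total differential. Then I used the fact that each term is a scalar, so transposing it leaves it unchanged: $y'X\,dw = dw'X'y$ and $w'X'X\,dw = dw'X'Xw$. In the last equation, the bracketed factor on the right is the gradient, $\nabla_w L = -2X'y + 2X'Xw = -2X'(y - Xw)$.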
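As a sanity check, the closed-form gradient can be compared against central finite differences. The following is only a minimal sketch using the Python standard library. It assumes $y$ is a vector, stores $X$ as a list of rows, and uses random test data. The function names (`residual`, `loss`, `grad`, `numeric_grad`) are made up for this illustration.

```python
import random

def residual(w, X, y):
    """Return r = y - Xw, with X given as a list of rows."""
    return [yi - sum(xij * wj for xij, wj in zip(row, w))
            for row, yi in zip(X, y)]

def loss(w, X, y):
    """L(w) = (y - Xw)'(y - Xw), the squared length of the residual."""
    return sum(ri * ri for ri in residual(w, X, y))

def grad(w, X, y):
    """Closed form -2X'y + 2X'Xw, evaluated as -2X'(y - Xw)."""
    r = residual(w, X, y)
    return [-2.0 * sum(row[j] * ri for row, ri in zip(X, r))
            for j in range(len(w))]

def numeric_grad(w, X, y, eps=1e-6):
    """Central finite differences, one coordinate of w at a time."""
    g = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        g.append((loss(w_plus, X, y) - loss(w_minus, X, y)) / (2 * eps))
    return g

# Arbitrary test data (an assumption for illustration): 5 samples, 3 weights.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(5)]
y = [random.gauss(0, 1) for _ in range(5)]
w = [random.gauss(0, 1) for _ in range(3)]

max_diff = max(abs(a - b) for a, b in zip(grad(w, X, y), numeric_grad(w, X, y)))
print(max_diff)  # should be tiny, limited only by floating-point rounding
```

Because $L$ is quadratic in $w$, the central difference has no truncation error, so any remaining discrepancy comes from rounding alone.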

0
On

Hint: Look at the sum on the left side of the equation.

  1. Try to differentiate the sum using the chain rule.
  2. If you have problems differentiating $x_i^Tw$ then write it out as a sum as well.
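The two steps can be sketched in code. The slide is not reproduced here, so this assumes the sum in question is $L = \sum_i (y_i - x_i^T w)^2$, with $x_i$ the $i$-th row of $X$. The function name `grad_from_sum` and the small data set are hypothetical and chosen only for illustration.

```python
def grad_from_sum(w, X, y):
    """Gradient of sum_i (y_i - x_i^T w)^2, accumulated term by term."""
    g = [0.0] * len(w)
    for x_i, y_i in zip(X, y):
        # Step 2: write x_i^T w out as a sum over components.
        r_i = y_i - sum(x_ij * w_j for x_ij, w_j in zip(x_i, w))
        # Step 1: chain rule. d/dw_j of r_i^2 is 2 * r_i * (-x_ij).
        for j, x_ij in enumerate(x_i):
            g[j] += -2.0 * r_i * x_ij
    return g

# Small hand-checkable example (assumed data): 3 samples, 2 weights.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 2.0, 3.0]
w = [0.5, -0.5]

g = grad_from_sum(w, X, y)
print(g)  # [-53.0, -68.0]
```

For this data, $X^T y = (22, 28)$ and $X^T X w = (-4.5, -6)$, so the matrix expression $-2X^T y + 2X^T X w$ also gives $(-53, -68)$.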