Differentiating the Norm


Let $RSS=\langle y-X\theta,\,y-X\theta\rangle$.

Then differentiating RSS w.r.t. $\theta$ gives: $$2\langle -X,\,y-X\theta\rangle.$$

This can be proved directly, but I want to know the general differentiation rules for norms of vectors. I've searched for them without success. I know the definition of a vector derivative; what I'm looking for is a rule analogous to the ones in the 1D case.


There is 1 solution below


The Frobenius product is a concise notation for the trace, i.e. $\;A:B={\rm Tr}(A^TB)$.
For vectors it coincides with the bracket notation $\big(\,\langle x,y\rangle = x:y\,\big)\,$ but is easier to type.
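As a quick sanity check of this definition, here is a minimal sketch in plain Python (no libraries). The helper names `transpose`, `matmul`, `trace` and `frobenius` and the sample matrices are illustrative assumptions, not part of the question. It compares the elementwise sum $\sum_{ij}A_{ij}B_{ij}$ with ${\rm Tr}(A^TB)$, and shows that for column vectors the product reduces to the ordinary dot product.

```python
def transpose(A):
    # Swap rows and columns of a matrix stored as a list of rows.
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    # Plain matrix product A @ B.
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def trace(A):
    # Sum of the diagonal entries of a square matrix.
    return sum(A[i][i] for i in range(len(A)))

def frobenius(A, B):
    # Frobenius product A:B as the sum of elementwise products.
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Illustrative 3x2 matrices (hypothetical values).
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
B = [[0.5, -1.0], [2.0, 0.0], [1.0, 3.0]]

lhs = frobenius(A, B)                 # elementwise sum
rhs = trace(matmul(transpose(A), B))  # Tr(A^T B)

# Column vectors: the Frobenius product is just the dot product <x, y>.
x = [[1.0], [2.0], [3.0]]
y = [[4.0], [5.0], [6.0]]
dot_xy = frobenius(x, y)

print(lhs, rhs, dot_xy)
```

Both `lhs` and `rhs` should agree up to rounding, which is all the notation $A:B$ claims.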

Define the vector $$w=X\theta-y$$ so that $y-X\theta=-w$ and $dw=X\,d\theta$. Write the RSS function in terms of this new variable, then calculate its differential and gradient. The product rule applies to the Frobenius product exactly as it does in the 1D case.
$$\eqalign{ R &= (-w):(-w) \;=\; w:w \\ dR &= dw:w + w:dw \;=\; 2w:dw \;=\; 2w:X\,d\theta \\ &=\; 2X^Tw:d\theta \\ \frac{\partial R}{\partial \theta} &= 2X^Tw \;=\; 2X^T(X\theta-y) \\ }$$
This agrees with the expression in the question when it is read as $-2X^T(y-X\theta)$.

NB: The properties of the trace allow the Frobenius product to be rearranged in many ways.
$$\eqalign{ A:B &= B:A \\ A:B &= A^T:B^T \\ A:BC &= B^TA:C \;=\; AC^T:B \;=\; \ldots \\ }$$
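The gradient formula $2X^T(X\theta-y)$ can be checked numerically. The following is a minimal sketch in plain Python, assuming small made-up values for $X$, $y$ and $\theta$; the function names (`rss`, `rss_gradient`, `numerical_gradient`) are hypothetical helpers introduced only for this illustration. It compares the analytic gradient with a central finite-difference approximation.

```python
def matvec(X, v):
    # Matrix-vector product X @ v, with X stored as a list of rows.
    return [sum(x * t for x, t in zip(row, v)) for row in X]

def rss(X, y, theta):
    # R = w:w with w = X theta - y.
    w = [p - yi for p, yi in zip(matvec(X, theta), y)]
    return sum(wi * wi for wi in w)

def rss_gradient(X, y, theta):
    # Analytic gradient from the derivation: 2 X^T (X theta - y).
    w = [p - yi for p, yi in zip(matvec(X, theta), y)]
    Xt = [list(col) for col in zip(*X)]
    return [2.0 * g for g in matvec(Xt, w)]

def numerical_gradient(f, theta, h=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += h
        dn = list(theta)
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad

# Illustrative data (hypothetical values): 3 observations, 2 parameters.
X = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
y = [1.0, 0.0, 2.0]
theta = [0.3, -0.7]

analytic = rss_gradient(X, y, theta)
numeric = numerical_gradient(lambda t: rss(X, y, t), theta)
max_err = max(abs(a - n) for a, n in zip(analytic, numeric))

print(analytic, numeric, max_err)
```

Because RSS is quadratic in $\theta$, the central difference has no truncation error, so `max_err` should be at the level of floating-point rounding.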