Derivative of diagonal function


I'm working on a slightly modified least-squares method which must minimize the quantity: $$ [Y-\text{diag}(\mu X^T)]^T\cdot [Y-\text{diag}(\mu X^T)] $$ where $Y$ is an $n$-dimensional vector and $\mu$ and $X$ are $n \times n$ matrices.

I'm having trouble explicitly calculating the derivative with respect to $\mu$ in order to obtain the normal equations. How can I do that?

There is 1 solution below


Let $w = Y-\text{diag}(\mu X^T)$, so that the quantity to be differentiated is $w^2 = w\cdot w = w^Tw$.

Also, let's denote the operation for turning a vector into a diagonal matrix as $W = \text{Diag}(w)$.

Then the derivative that you requested is: $$ \frac {\partial w^2} {\partial\mu} = -2 W \cdot X $$
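A quick numerical sanity check of this formula can be sketched with central finite differences. The snippet below is only an illustration, not part of the derivation: the function names (`objective`, `gradient`, `numeric_gradient`), the size $n=4$ and the random test data are all assumptions made for the example. It relies on the component form $\text{diag}(\mu X^T)_i = \sum_k \mu_{ik}X_{ik}$.

```python
import random

def objective(mu, X, Y):
    """w . w, where w_i = Y_i - sum_k mu[i][k] * X[i][k] (the i-th entry of diag(mu X^T))."""
    n = len(Y)
    return sum((Y[i] - sum(mu[i][k] * X[i][k] for k in range(n))) ** 2
               for i in range(n))

def gradient(mu, X, Y):
    """Closed form -2 * Diag(w) * X: entry (i, k) equals -2 * w_i * X[i][k]."""
    n = len(Y)
    w = [Y[i] - sum(mu[i][k] * X[i][k] for k in range(n)) for i in range(n)]
    return [[-2.0 * w[i] * X[i][k] for k in range(n)] for i in range(n)]

def numeric_gradient(mu, X, Y, h=1e-6):
    """Central finite differences, perturbing one entry of mu at a time."""
    n = len(Y)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            plus = [row[:] for row in mu]
            minus = [row[:] for row in mu]
            plus[i][k] += h
            minus[i][k] -= h
            g[i][k] = (objective(plus, X, Y) - objective(minus, X, Y)) / (2 * h)
    return g

# Hypothetical random test data (not from the original question).
random.seed(0)
n = 4
mu = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
Y = [random.uniform(-1, 1) for _ in range(n)]

analytic = gradient(mu, X, Y)
numeric = numeric_gradient(mu, X, Y)
max_err = max(abs(analytic[i][k] - numeric[i][k])
              for i in range(n) for k in range(n))
print("largest discrepancy:", max_err)
```

Because the objective is quadratic in $\mu$, the central difference should agree with the closed form up to rounding error.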

To derive it, I'll use the handy third-order hyper-diagonal tensor $\beta$, whose components are $\beta_{ijk} = 1$ when $i\!=\!j\!=k$ and zero otherwise.

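This tensor allows the diag/Diag operators to be expressed as $$ \eqalign { w &= \text{diag}(W) = \beta:W \cr W &= \text{Diag}(w) = \beta\cdot w \cr }$$ Here the colon is a double contraction, $(\beta:W)_i = \sum_{j,k}\beta_{ijk}W_{jk}$, and the dot is a single contraction, $(\beta\cdot w)_{ij} = \sum_k \beta_{ijk}w_k$.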
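To make the two contractions concrete, here is a small sketch that builds $\beta$ for $n=3$ and applies it to a sample matrix and vector. The helper names `diag_via_beta` and `Diag_via_beta` and the sample data are made up for this illustration.

```python
n = 3

# beta[i][j][k] is 1 only when all three indices coincide.
beta = [[[1 if i == j == k else 0 for k in range(n)]
         for j in range(n)]
        for i in range(n)]

def diag_via_beta(M):
    """Double contraction beta : M, i.e. sum over j and k of beta[i][j][k] * M[j][k]."""
    return [sum(beta[i][j][k] * M[j][k] for j in range(n) for k in range(n))
            for i in range(n)]

def Diag_via_beta(v):
    """Single contraction beta . v, i.e. sum over k of beta[i][j][k] * v[k]."""
    return [[sum(beta[i][j][k] * v[k] for k in range(n)) for j in range(n)]
            for i in range(n)]

M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
v = [10, 20, 30]

extracted = diag_via_beta(M)   # picks out the main diagonal of M
embedded = Diag_via_beta(v)    # places v on the main diagonal of a zero matrix
print(extracted)
print(embedded)
```

The double contraction picks the diagonal out of a matrix, while the single contraction spreads a vector along the diagonal, which is exactly the diag/Diag pair used in the derivation.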

Okay, here we go: $$ \eqalign{ dw^2 &= 2 w\cdot dw \cr &= 2 w\cdot d[Y-\text{diag}(\mu X^T)] \cr &= 2 w\cdot d[Y-\beta:(\mu X^T)] \cr &= -2 w\cdot \beta:d(\mu X^T) \cr &= -2 W:d(\mu X^T) \cr &= -2 W:d\mu\cdot X^T \cr &= -2 W\cdot X:d\mu \cr }$$ The last step uses the identity $A:(BC) = (AC^T):B$ for the Frobenius product $A:B = \text{tr}(A^TB)$, which follows from the cyclic property of the trace. Finally, going from the differential to the derivative yields: $$ \eqalign{ \frac {\partial w^2} {\partial\mu} &= -2 W \cdot X \cr }$$
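As a cross-check, the same result can be obtained in plain index notation, without the tensor $\beta$. Writing $w^2$ for $w\cdot w$ and using $\text{diag}(\mu X^T)_i = \sum_k \mu_{ik}X_{ik}$, only the $i$-th component of $w$ depends on $\mu_{ik}$: $$ \eqalign{ w_i &= Y_i - \sum_k \mu_{ik} X_{ik} \cr \frac{\partial w^2}{\partial\mu_{ik}} &= 2 w_i \frac{\partial w_i}{\partial\mu_{ik}} \cr &= -2 w_i X_{ik} \cr &= -2\, (W\cdot X)_{ik} \cr }$$ Setting this to zero gives the stationarity conditions $w_i X_{ik} = 0$ for all $i,k$.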