Take derivative of matrix

A part of an objective function is:

$$F=\|H-\mu_H\|_F^2$$

And we have:

$$\mu_H=\frac{\Sigma H}{n_H}$$

Here $\mu_H$ is the mean of $H$ taken along one dimension and then repeated $n$ times, so that all columns of $\mu_H$ are identical and $\mu_H$ has the same shape as $H$.

How can I take the derivative of $F$ with respect to $H$, i.e., $\frac{\partial F}{\partial H}$?

There are 2 best solutions below

Your function can be written as $F(H) = f(g(H))$, where $$ f(X) = \|X\|_F^2, \quad g(H) = H - \mu_H = H - \frac 1n Hee^T, $$ and $e = (1,\dots,1)^T$ is the all-ones vector of length $n$. Since $g$ is linear, the total derivatives of these functions are given by $$ df(X)(K) = 2\operatorname{Tr}(X^TK), \quad dg(H)(K) = g(K). $$ With the chain rule, we have $$ \begin{aligned} d(f \circ g)(H)(K) &= [df(g(H)) \circ dg(H)](K) \\ &= df(g(H))(g(K)) \\ &= 2\operatorname{Tr}\left( g(H)^T\left(K - \frac 1n Kee^T\right)\right) \\ &= 2\operatorname{Tr}(g(H)^TK) - \frac 2n\operatorname{Tr}\left((g(H)ee^T)^TK\right) \\ &= 2 \operatorname{Tr}(g(H)^TK), \end{aligned} $$ where the last step uses $g(H)e = He - \frac 1n He(e^Te) = 0$, because $e^Te = n$. In "denominator layout", this derivative is given by $\frac{\partial F}{\partial H} = 2g(H) = 2(H - \mu_H)$.
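As a sanity check, the result $\frac{\partial F}{\partial H} = 2g(H)$ can be compared against central finite differences. This is only a sketch under stated assumptions: NumPy is used, $H$ is taken to be a $3 \times 5$ random matrix, and the helper names (`centered`, `objective`, `analytic_grad`, `numeric_grad`) are made up for illustration.

```python
import numpy as np

def centered(H):
    """g(H) = H - (1/n) H e e^T: subtract each row's mean from every column."""
    return H - H.mean(axis=1, keepdims=True)

def objective(H):
    """F(H) = ||g(H)||_F^2."""
    return np.sum(centered(H) ** 2)

def analytic_grad(H):
    """Gradient claimed by the derivation: 2 g(H)."""
    return 2.0 * centered(H)

def numeric_grad(H, eps=1e-6):
    """Entry-by-entry central finite differences of F."""
    G = np.zeros_like(H)
    for idx in np.ndindex(*H.shape):
        E = np.zeros_like(H)
        E[idx] = eps
        G[idx] = (objective(H + E) - objective(H - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
H = rng.standard_normal((3, 5))  # assumed shape: 3 rows, n = 5 columns

max_err = np.max(np.abs(analytic_grad(H) - numeric_grad(H)))
print("max abs difference:", max_err)
```

Because $F$ is quadratic in $H$, the central difference has no truncation error, so the reported difference should be at the level of floating-point rounding.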

The Centering Matrix is an idempotent matrix which can be defined in terms of the identity matrix $(I)$ and all-ones matrix $(J)$ as $$\eqalign{ C &= I - \tfrac 1nJ, \qquad C^2 &= C = C^T \\ }$$ Define the matrix variable $Y$ as $$\eqalign{ Y &= HC \;=\; (H-\mu_H) \\ YC &= HC^2 \;=\; HC = Y \qquad\qquad\quad \\ }$$ Rewriting the objective function using $Y$ makes it easier to differentiate $$\eqalign{ \def\tr{\operatorname{Tr}} \def\p{\partial} F &= \|Y\|^2_F \;=\; Y:Y \\ dF &= 2Y:dY = 2Y:(dH\,C) = 2(YC):dH = 2Y:dH \\ \frac{\p F}{\p H} &= 2Y \;=\; 2(H-\mu_H) \\\\ }$$
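The properties of the centering matrix used above can be checked numerically. The following is a minimal sketch, assuming NumPy, $n = 5$ columns, and a random $3 \times 5$ matrix $H$; the variable names are chosen here for illustration only.

```python
import numpy as np

n = 5                                  # assumed number of columns of H
C = np.eye(n) - np.ones((n, n)) / n    # C = I - (1/n) J

rng = np.random.default_rng(1)         # fixed seed, arbitrary choice
H = rng.standard_normal((3, n))
Y = H @ C                              # Y = HC

# C is idempotent and symmetric: C^2 = C = C^T
is_idempotent = np.allclose(C @ C, C)
is_symmetric = np.allclose(C, C.T)

# HC equals H with each row's mean subtracted, i.e. H - mu_H
matches_mean_subtraction = np.allclose(Y, H - H.mean(axis=1, keepdims=True))

# YC = Y, which is what lets C be absorbed in the differential
absorbs_C = np.allclose(Y @ C, Y)

print(is_idempotent, is_symmetric, matches_mean_subtraction, absorbs_C)
```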


In the above, a colon is used as a convenient product notation for the trace, i.e. $$\eqalign{ A:B &= \tr(A^TB) \;=\; B:A \\ A:A &= \tr(A^TA) \;=\; \|A\|^2_F \\ }$$
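The colon-product identities can also be illustrated numerically. This is a small sketch assuming NumPy and random $3 \times 5$ matrices; `frob` is a hypothetical helper name, not a library function. The last check is the rearrangement $A:(BC) = (AC):B$, valid for symmetric $C$, which is the step used to move $C$ off $dH$.

```python
import numpy as np

def frob(A, B):
    """Colon (Frobenius) product A:B = Tr(A^T B)."""
    return np.trace(A.T @ B)

rng = np.random.default_rng(2)         # fixed seed, arbitrary choice
A = rng.standard_normal((3, 5))
B = rng.standard_normal((3, 5))
C = np.eye(5) - np.ones((5, 5)) / 5    # symmetric centering matrix

symmetric = np.isclose(frob(A, B), frob(B, A))            # A:B = B:A
elementwise = np.isclose(frob(A, B), np.sum(A * B))       # sum of entrywise products
norm_identity = np.isclose(frob(A, A), np.linalg.norm(A, "fro") ** 2)  # A:A = ||A||_F^2
moves_across = np.isclose(frob(A, B @ C), frob(A @ C, B)) # A:(BC) = (AC):B

print(symmetric, elementwise, norm_identity, moves_across)
```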