How to get the derivative of a matrix function?

Question


8.2k Views Asked by Bumbble Comm At 07 Apr 2026 - 4:50

I want to compute the derivative of a matrix function, as follows: $$\frac{\partial f(\boldsymbol{AX})}{\partial \boldsymbol{X}}$$ where $f(\cdot)$ is a scalar function. I expect the result to have the same shape as the matrix $\boldsymbol{X}$.

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2015-07-29 01:22:49

It is basically the same as with vectors. The chain rule yields the total derivative (for any matrix $H$, having the same size as $X$) $$ D_X (f(AX)) [H] = f'(AX)[AH] = \langle \nabla f(AX), AH \rangle = \langle A^T \nabla f(AX), H\rangle. $$ Thus, the gradient is $A^T \nabla f(AX)$. Here, the inner product is given by $\langle X,Y \rangle = \operatorname{trace}(X^TY)$, and the gradient is the matrix of partial derivatives ordered as $X$.
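The formula $A^T \nabla f(AX)$ can be checked numerically. The sketch below assumes a particular scalar function, $f(Y) = \tfrac12\|Y\|_F^2$ with $\nabla f(Y) = Y$, chosen only for illustration; the helper names (`analytic_grad`, `numeric_grad`) are hypothetical and not part of the answer above. It compares the chain-rule gradient with central finite differences taken one entry of $X$ at a time.

```python
import numpy as np

def f(Y):
    # Illustrative scalar function (an assumption): f(Y) = 0.5 * ||Y||_F^2
    return 0.5 * np.sum(Y ** 2)

def grad_f(Y):
    # Gradient of the illustrative f with respect to its argument Y
    return Y

def analytic_grad(A, X):
    # Chain-rule gradient of X -> f(AX): A^T grad_f(AX)
    return A.T @ grad_f(A @ X)

def numeric_grad(A, X, eps=1e-6):
    # Central finite differences, perturbing one entry of X at a time
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            H = np.zeros_like(X)
            H[i, j] = eps
            G[i, j] = (f(A @ (X + H)) - f(A @ (X - H))) / (2 * eps)
    return G

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
X = rng.standard_normal((3, 2))

G_analytic = analytic_grad(A, X)
G_numeric = numeric_grad(A, X)
max_error = np.max(np.abs(G_analytic - G_numeric))

print("gradient shape:", G_analytic.shape, "X shape:", X.shape)
print("max abs difference:", max_error)
```

The gradient has the same shape as $X$, as the question expected, and the two computations should agree up to floating-point error.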

**Bumbble Comm** · Answer 2 · 2015-08-03 18:48:40

Let $Y\!=\!AX$, so that $f\!=\!f(Y)$. I assume that you know how to calculate the derivative $\frac{\partial f}{\partial Y}$ and now wish to calculate $\frac{\partial f}{\partial X}$.

So write down the differential in terms of the Frobenius product (:) and switch the independent variable from $Y$ to $X$. $$\eqalign{ df &= \frac{\partial f}{\partial Y} : dY \cr &= \frac{\partial f}{\partial Y} : (AdX) \cr &= (A^T\frac{\partial f}{\partial Y}) : dX \cr\cr \frac{\partial f}{\partial X} &= A^T\frac{\partial f}{\partial Y} \cr }$$

If you do not know how to calculate $\frac{\partial f}{\partial Y}$ and want help with that, then you'll need to give us more information about the function.

If you are uncomfortable with the Frobenius product, you can replace it with the trace function, $\,\,A\!:\!B = {\rm tr}(A^T\!B)$.
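The key step in the derivation is moving $A$ across the Frobenius product, $B : (A\,dX) = (A^T B) : dX$. A minimal numerical sketch of that identity and of the trace form follows. The random matrices are placeholders: `B` stands in for $\frac{\partial f}{\partial Y}$, and the function name `frobenius` is a hypothetical helper.

```python
import numpy as np

def frobenius(P, Q):
    # Frobenius product P : Q = sum over all entries of P_ij * Q_ij
    return np.sum(P * Q)

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((4, 2))   # stands in for df/dY, same shape as Y = A X
dX = rng.standard_normal((3, 2))  # arbitrary perturbation, same shape as X

lhs = frobenius(B, A @ dX)            # B : (A dX)
rhs = frobenius(A.T @ B, dX)          # (A^T B) : dX
via_trace = np.trace(B.T @ (A @ dX))  # tr(B^T (A dX))

print(lhs, rhs, via_trace)
```

All three printed values should coincide up to rounding, which is why the coefficient of $dX$ can be read off as $A^T\frac{\partial f}{\partial Y}$.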

Update

When a scalar function ($f$) is applied element-wise to a matrix argument ($Y$), the result is itself a matrix, and its differential can be expressed in terms of the Hadamard ($\circ$) product as $$ \eqalign { df &= f'\circ dY \cr } $$ We can use the single-entry matrix $E_{ij}$ and the Frobenius (:) product to isolate a single element $$ \eqalign { df_{ij} &= E_{ij}:df \cr &= E_{ij}:(f'\circ dY) \cr &= (E_{ij}\circ f'): dY \cr } $$ Finally, the sigmoid function mentioned in the comments is interesting because the derivative is $f'=(f-f^2)$, where $f^2 = f\circ f$ is the element-wise square, which allows us to write $$ \eqalign { df_{ij} &= \big(E_{ij}\circ(f-f^2)\big) : dY \cr } $$

Since $df_{ij}=(\frac{\partial f_{ij}} {\partial Y}:dY)$, the derivative of this element with respect to $Y$ is $$ \eqalign { \frac{\partial f_{ij}} {\partial Y} &= E_{ij}\circ(f-f^2) \cr } $$ and with respect to $X$ it's $$ \eqalign { \frac{\partial f_{ij}} {\partial X} &= A^T\,\frac{\partial f_{ij}} {\partial Y} \cr } $$
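As a rough check of the element-wise result, the sketch below builds $A^T\big(E_{ij}\circ(f-f^2)\big)$ for the sigmoid and compares it with finite differences of the single entry $f_{ij}$ of $f(AX)$. The function names, matrix sizes, and the chosen indices `i, j` are assumptions made for illustration.

```python
import numpy as np

def sigmoid(Y):
    # Element-wise logistic function
    return 1.0 / (1.0 + np.exp(-Y))

def d_element_dX(A, X, i, j):
    # A^T (E_ij Hadamard (f - f^2)), with f = sigmoid(AX) applied element-wise
    F = sigmoid(A @ X)
    E = np.zeros_like(F)
    E[i, j] = 1.0  # single-entry matrix E_ij
    return A.T @ (E * (F - F ** 2))

def numeric_d_element_dX(A, X, i, j, eps=1e-6):
    # Central finite differences of the scalar f_ij with respect to each X_kl
    G = np.zeros_like(X)
    for k in range(X.shape[0]):
        for l in range(X.shape[1]):
            H = np.zeros_like(X)
            H[k, l] = eps
            plus = sigmoid(A @ (X + H))[i, j]
            minus = sigmoid(A @ (X - H))[i, j]
            G[k, l] = (plus - minus) / (2 * eps)
    return G

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))
X = rng.standard_normal((3, 2))
i, j = 2, 1  # which entry of f(AX) to differentiate

D_analytic = d_element_dX(A, X, i, j)
D_numeric = numeric_d_element_dX(A, X, i, j)

print("max abs difference:", np.max(np.abs(D_analytic - D_numeric)))
```

Only column `j` of the result is nonzero, since $f_{ij}$ depends on $X$ only through the $j$-th column of $Y = AX$.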
