Differentiation with respect to matrix


I have matrices $W$ and $X$ of dimensions $h\times d$ and $d\times1$ respectively. I want to calculate the partial derivative of $WX$ with respect to $W$. Will that be $X$?


There are 2 best solutions below

0
On BEST ANSWER

Let's write your function in index notation, with summation over the repeated index $j$ implied: $$y_i = W_{ij} x_j$$ Before we begin, we need the gradient of a matrix with respect to itself: $$\frac{\partial W_{ij}}{\partial W_{km}} = \delta_{ik}\,\delta_{jm}$$

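Now we can find the differential and then the gradient of your function: $$\eqalign{ dy_i &= dW_{ij} x_j \cr \frac{\partial y_i}{\partial W_{km}} &= \delta_{ik}\,\delta_{jm}\,x_j = \delta_{ik}\,x_m \cr }$$ Note that the result is a third-order tensor of shape $h \times h \times d$, not simply $X$: the entry for output index $i$ and matrix entry $(k,m)$ is $x_m$ when $i = k$ and zero otherwise.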
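As a sanity check on the formula $\partial y_i / \partial W_{km} = \delta_{ik}\,x_m$, here is a minimal NumPy sketch that builds the tensor from the closed form and compares it with finite differences. The sizes `h = 3`, `d = 4`, the random data, and the step `eps` are arbitrary choices made for illustration, not part of the question.

```python
import numpy as np

# Illustrative sizes and random data (assumptions, not from the question)
h, d = 3, 4
rng = np.random.default_rng(0)
W = rng.standard_normal((h, d))
x = rng.standard_normal(d)

# Closed form: T[i, k, m] = delta_ik * x_m
T_closed = np.einsum("ik,m->ikm", np.eye(h), x)

# Finite differences: perturb one entry W[k, m] at a time
eps = 1e-6
T_numeric = np.zeros((h, h, d))
for k in range(h):
    for m in range(d):
        E = np.zeros((h, d))
        E[k, m] = 1.0
        T_numeric[:, k, m] = ((W + eps * E) @ x - W @ x) / eps

# Largest entrywise disagreement between the two tensors
max_error = np.max(np.abs(T_closed - T_numeric))
print(T_closed.shape, max_error)
```

Because $y = Wx$ is linear in $W$, the finite-difference quotient should agree with the closed form up to floating-point rounding, whatever step size is used.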

1
On

The definition of "partial derivative of [...] with respect to a matrix" is unclear to me. What is clearly defined is the notion of directional derivative.

Let $X \in \mathbb{R}^d$ and $f \, : \, \mathrm{Mat}\big( (h,d), \mathbb{R} \big) \, \rightarrow \, \mathbb{R}^h$ such that:

$$\forall W \in \mathrm{Mat}\big( (h,d), \mathbb{R} \big), \; f(W) = WX.$$

The directional derivative of $f$ at the point $W \in \mathrm{Mat}\big( (h,d), \mathbb{R} \big)$ in the direction $V \in \mathrm{Mat}\big( (h,d), \mathbb{R} \big)$ is defined by:

$$ \lim \limits_{t \to 0} \frac{ f(W + tV) - f(W) }{t}. $$

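The value of this limit is usually denoted by $\mathrm{D}_{W} f \cdot V$ or $df(W)(V)$ or even $(\partial f / \partial W)(V)$. Since $f$ is linear in $W$, we have $f(W + tV) - f(W) = tVX$, so the difference quotient equals $VX$ for every $t \neq 0$. Therefore: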

$$ \frac{\partial f}{\partial W}(V) = VX. $$
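The identity $(\partial f / \partial W)(V) = VX$ can be checked numerically. The sketch below, with arbitrarily chosen sizes and random data, evaluates the difference quotient for a few values of $t$ and also contracts the tensor $\delta_{ik}\,x_m$ from the other answer with $V$, which should give the same vector.

```python
import numpy as np

# Illustrative sizes and random data (assumptions, not from the question)
h, d = 3, 4
rng = np.random.default_rng(1)
W = rng.standard_normal((h, d))
V = rng.standard_normal((h, d))
X = rng.standard_normal((d, 1))

def f(W):
    """The map W -> W X from the question."""
    return W @ X

def difference_quotient(t):
    """(f(W + tV) - f(W)) / t, whose limit as t -> 0 is the directional derivative."""
    return (f(W + t * V) - f(W)) / t

# The quotient should match V X for every nonzero t, up to rounding
for t in (1e-1, 1e-3, 1e-5):
    gap = np.max(np.abs(difference_quotient(t) - V @ X))
    print(t, gap)

# Contracting T[i, k, m] = delta_ik * x_m with V[k, m] gives the same vector
T = np.einsum("ik,m->ikm", np.eye(h), X[:, 0])
contracted = np.einsum("ikm,km->i", T, V)
```

Seen this way, the two answers describe the same object: the third-order tensor is the coordinate representation of the linear map $V \mapsto VX$.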