Derivative w.r.t. matrix


Suppose that we have a function $f(A) = Ab$ for $A \in \mathbb{R}^{m\times n}$ and $b \in \mathbb{R}^{n}$. Is this correct? $$\frac{df}{dA} = \begin{bmatrix} b^T\\ \vdots \\ b^T \end{bmatrix}$$


There are 2 best solutions below


In your example, $f$ is a vector with $m$ entries. Let $f_i$ denote its $i$-th entry. In particular, we can state that:

$$f_i : \mathbb{R}^{m\times n} \to \mathbb{R}.$$

In other words, $f_i(A)$ is a function of a matrix $A$, returning a real number.

The derivative of $f_i(A)$ with respect to $A$ is a matrix $C_i = \{c_{i,j,k}\}$ ($j$ is the row index, $k$ is the column index), such that:

$$c_{i,j,k} = \frac{\partial f_i}{\partial a_{j,k}}.$$

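In your example, $f_i = \sum_k a_{i,k} b_k$ depends only on the $i$-th row of $A$, so:

$$c_{i,j,k} = \begin{cases} b_k & \text{if } j = i,\\ 0 & \text{otherwise.} \end{cases}$$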

In the end, the derivative of the whole vector $f$ with respect to $A$ is a 3D tensor (i.e. an array with three indices), whose entries are exactly $c_{i,j,k}.$
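As a rough numerical sanity check, the tensor of partial derivatives can be compared against finite differences. The sketch below uses plain Python lists. The helper names (`f`, `analytic_tensor`, `numeric_tensor`) and the sample values of $A$ and $b$ are illustrative assumptions, not part of the question. It relies on the fact that $f_i = \sum_k a_{i,k} b_k$ only involves row $i$ of $A$, so $\partial f_i / \partial a_{j,k}$ equals $b_k$ when $j = i$ and $0$ otherwise.

```python
def f(A, b):
    """Return the matrix-vector product A b as a list of m numbers."""
    return [sum(a_jk * b_k for a_jk, b_k in zip(row, b)) for row in A]


def analytic_tensor(m, b):
    """c[i][j][k] = d f_i / d a_{j,k}: b_k when j == i, zero otherwise."""
    n = len(b)
    return [[[b[k] if i == j else 0.0 for k in range(n)]
             for j in range(m)]
            for i in range(m)]


def numeric_tensor(A, b, eps=1e-6):
    """Approximate the same tensor with forward differences."""
    m, n = len(A), len(b)
    base = f(A, b)
    c = [[[0.0] * n for _ in range(m)] for _ in range(m)]
    for j in range(m):
        for k in range(n):
            bumped = [row[:] for row in A]   # copy A, then perturb one entry
            bumped[j][k] += eps
            shifted = f(bumped, b)
            for i in range(m):
                c[i][j][k] = (shifted[i] - base[i]) / eps
    return c


# Hypothetical sample data: a 2x3 matrix and a vector in R^3.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [0.5, -1.0, 2.0]

exact = analytic_tensor(len(A), b)
approx = numeric_tensor(A, b)
max_err = max(abs(exact[i][j][k] - approx[i][j][k])
              for i in range(2) for j in range(2) for k in range(3))
```

Here `exact[i][i]` is a copy of `b` and every other slice `exact[i][j]` is zero, so the tensor has shape $m \times m \times n$, not the $m \times n$ matrix of stacked $b^T$ rows proposed in the question.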


As a start, adding on to @the_candyman's post...

By the definition of the derivative, $\mathcal{D}f$ should be a linear map taking $\mathbb{R}^{m\times n} \to \mathbb{R}^m$ which satisfies: \begin{align} \lim_{H\to 0_{m\times n}} \frac{\|f(A+H) - f(A)- \mathcal{D}f \cdot H\|}{\|H\|} = 0 \end{align}

From above, we see that \begin{align} \frac{\|f(A+H) - f(A)- \mathcal{D}f \cdot H\|}{\|H\|} = \frac{\|(A+H)b - Ab - \mathcal{D}f \cdot H\|}{\|H\|} = \frac{\|Hb - \mathcal{D}f \cdot H\|}{\|H\|} \end{align} so $\mathcal{D}f$ must be built from $b$ in the "right" way for the numerator to vanish. Taking $\mathcal{D}f \cdot H = Hb$, which is linear in $H$, makes the numerator identically zero.
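The two viewpoints can be tied together numerically. Because $f$ is linear in $A$, the increment $f(A+H) - f(A)$ is exactly $Hb$, and contracting a tensor with entries $b_k$ for $j = i$ and $0$ otherwise against $H$ gives that same vector. The following is a minimal sketch in plain Python. The helper names and the sample values of $A$, $H$ and $b$ are assumptions chosen for illustration.

```python
def matvec(M, v):
    """Return the matrix-vector product M v as a list."""
    return [sum(x * y for x, y in zip(row, v)) for row in M]


def derivative_applied(H, b):
    """(Df . H)_i = sum over j, k of c[i][j][k] * H[j][k],
    where c[i][j][k] is b[k] when j == i and 0 otherwise."""
    m, n = len(H), len(b)
    return [sum((b[k] if i == j else 0.0) * H[j][k]
                for j in range(m) for k in range(n))
            for i in range(m)]


# Hypothetical sample data: A and H are 2x3, b is in R^3.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
H = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.4]]
b = [0.5, -1.0, 2.0]

A_plus_H = [[a + h for a, h in zip(row_a, row_h)]
            for row_a, row_h in zip(A, H)]

# Increment f(A + H) - f(A) versus the candidate linear map applied to H.
increment = [x - y for x, y in zip(matvec(A_plus_H, b), matvec(A, b))]
linear_part = derivative_applied(H, b)
residual = max(abs(x - y) for x, y in zip(increment, linear_part))
```

Up to floating-point rounding, `residual` is zero for any `H`, not only for small ones, which reflects that a linear map is its own derivative.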