Derivative of dot-product involving a matrix function


I am struggling with the following derivative: $$\frac{\partial }{\partial \mathbf{x}}(\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\,\mathbf{c}))$$ where $\mathbf{b}\in \mathbb{R}^n$ and $\mathbf{c}\in \mathbb{R}^m$ are constant, and $\mathbf{A}\in \mathbb{R}^{n\times m}$ is a function of $\mathbf{x}\in \mathbb{R}^l$. I know that $$\frac{\partial \mathbf{A}(\mathbf{x})}{\partial \mathbf{x}}=\mathbf{M}$$ with $\mathbf{M}\in\mathbb{R}^{n\times m\times l}$ a $3$D tensor such that $$M_{ijk}=\frac{\partial A_{ij}}{\partial x_k},$$ but I am not able to obtain the result for the original problem from this. I am primarily having trouble getting the dimensions right. Can you give me a hint? Thanks


There are 2 best solutions below


We want $$\frac{\partial }{\partial \mathbf{x}}(\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\,\mathbf{c}))$$ where $\mathbf{b}\in \mathbb{R}^n$ and $\mathbf{c}\in \mathbb{R}^m$ are constant, and $\mathbf{A}\in \mathbb{R}^{n\times m}$ is a function of $\mathbf{x}\in \mathbb{R}^l$. Start by writing the scalar in components:

$$\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\,\mathbf{c})=\sum_{i,j}b_i A_{ij}c_j$$

$$\frac{\partial}{\partial x_k}[\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\,\mathbf{c})]=\sum_{i,j}b_i \frac{\partial A_{ij}}{\partial x_k}c_j=\sum_{i=1}^n\sum_{j=1}^m b_i\,M_{ijk}\,c_j,\qquad k=1,2,\dots,l$$
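One way to sanity-check the dimensions, using only the symbols above: for each fixed $k$, the slice $\partial\mathbf{A}/\partial x_k$ is an ordinary $n\times m$ matrix, so the $k$-th component is a product of the form (row vector)(matrix)(column vector):

$$\frac{\partial}{\partial x_k}[\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\,\mathbf{c})]=\underbrace{\mathbf{b}^\intercal}_{1\times n}\;\underbrace{\frac{\partial \mathbf{A}}{\partial x_k}}_{n\times m}\;\underbrace{\mathbf{c}}_{m\times 1}\in\mathbb{R},\qquad k=1,2,\dots,l$$

The indices $i$ and $j$ are summed away, and only $k$ survives, so the derivative has exactly $l$ components.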

In vector form:

$$\frac{\partial }{\partial \mathbf{x}}(\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\,\mathbf{c}))=\left(\sum_{i=1}^n\sum_{j=1}^m b_i\,M_{ij1}\,c_j,\;\;\sum_{i=1}^n\sum_{j=1}^m b_i\,M_{ij2}\,c_j,\;\;\dots,\;\;\sum_{i=1}^n\sum_{j=1}^m b_i\,M_{ijl}\,c_j\right)$$
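If it helps, the formula can be checked numerically. The following is a minimal NumPy sketch, not part of the derivation: the particular matrix function `A`, its hand-computed derivative tensor `M`, and the sizes $n=2$, $m=3$, $l=2$ are all made up for illustration. It evaluates $\sum_{i,j} b_i M_{ijk} c_j$ with `np.einsum` and compares it against a central finite difference.

```python
import numpy as np

# Hypothetical example with n = 2, m = 3, l = 2 (chosen only for illustration).
def A(x):
    """Matrix-valued function A(x) of shape (n, m) = (2, 3)."""
    x0, x1 = x
    return np.array([[x0,         x1,      x0 * x1],
                     [np.sin(x0), x1 ** 2, 1.0]])

def M(x):
    """Derivative tensor M[i, j, k] = dA[i, j] / dx[k], shape (2, 3, 2)."""
    x0, x1 = x
    dA_dx0 = np.array([[1.0,        0.0, x1],
                       [np.cos(x0), 0.0, 0.0]])
    dA_dx1 = np.array([[0.0, 1.0,      x0],
                       [0.0, 2.0 * x1, 0.0]])
    return np.stack([dA_dx0, dA_dx1], axis=-1)

def f(x, b, c):
    """The scalar b . (A(x) c)."""
    return b @ (A(x) @ c)

def grad_f(x, b, c):
    """Gradient from the formula: sum over i, j of b_i M_ijk c_j."""
    return np.einsum("i,ijk,j->k", b, M(x), c)

def grad_fd(x, b, c, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = np.zeros_like(x)
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = h
        g[k] = (f(x + e, b, c) - f(x - e, b, c)) / (2 * h)
    return g

x = np.array([0.3, -1.2])
b = np.array([1.0, -2.0])
c = np.array([0.5, 2.0, -1.0])

print(grad_f(x, b, c))   # analytic gradient, shape (l,)
print(grad_fd(x, b, c))  # should agree up to finite-difference error
```

The subscript string `"i,ijk,j->k"` mirrors the index expression directly: `i` and `j` are summed, and `k` is the only free index, which is why the result has length $l$.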


As has been demonstrated by others, the derivative you are looking for is $$ \frac{\partial}{\partial\mathbf{x}}\left(\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\mathbf{c})\right) = \mathbf{c}\cdot(\mathbf{b}\cdot\mathbf{M}), $$ where $\mathbf{M} = \nabla \mathbf{A}$ is defined as in the OP, and I follow the convention that dot products on the left (resp. right) of $\mathbf{M}$ operate over the index in its first (resp. last) position.

Here I give a coordinate-free calculation based on differential forms, which I find cleaner than using tensor indices (although it is useful to be able to handle such calculations both ways). Since $\mathbf{b}$ and $\mathbf{c}$ are constant, we have $$ d\left(\mathbf{b}\cdot(\mathbf{A}(\mathbf{x})\mathbf{c})\right) = \mathbf{b}\cdot (d\mathbf{A}(\mathbf{x})\mathbf{c}) = \mathbf{bc}^\intercal: d\mathbf{A}(\mathbf{x}) = \left(\mathbf{c}\cdot(\mathbf{b}\cdot\mathbf{M})\right)\cdot d\mathbf{x},$$ where $:$ denotes the Frobenius (double-contraction) inner product $\mathbf{X}:\mathbf{Y}=\sum_{i,j}X_{ij}Y_{ij}$, and the final equality is due to $d\mathbf{A} = \mathbf{M}\cdot d\mathbf{x}$. The result then follows since $ df(\mathbf{x}) = \nabla f(\mathbf{x})\cdot d\mathbf{x}$.
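The contraction convention and the chain of equalities can also be spot-checked numerically. This is only a sketch under stated assumptions: `M` is a random array standing in for $\nabla\mathbf{A}$ at some point, `dx` is an arbitrary increment, and the sizes are arbitrary. A left dot product is implemented as a contraction over the first index with `np.tensordot`, and a right dot product as a contraction over the last index.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, l = 4, 3, 2                      # arbitrary sizes for illustration

b = rng.standard_normal(n)
c = rng.standard_normal(m)
M = rng.standard_normal((n, m, l))     # stand-in for dA/dx at some point
dx = rng.standard_normal(l)            # arbitrary increment in x

# Left dot products contract the FIRST index of what follows.
bM = np.tensordot(b, M, axes=(0, 0))       # (b . M)[j, k], shape (m, l)
grad = np.tensordot(c, bM, axes=(0, 0))    # c . (b . M),   shape (l,)

# Right dot product contracts the LAST index: dA = M . dx, shape (n, m).
dA = np.tensordot(M, dx, axes=(2, 0))

lhs = b @ (dA @ c)                     # b . (dA c)
mid = np.sum(np.outer(b, c) * dA)      # (b c^T) : dA, Frobenius inner product
rhs = grad @ dx                        # (c . (b . M)) . dx

print(bM.shape, grad.shape)            # intermediate and final shapes
print(lhs, mid, rhs)                   # the three values should coincide
```

Tracking the shapes $(n,m,l)\to(m,l)\to(l)$ makes it explicit where each dimension disappears, which is the part of the question that concerned dimensions.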