Differentiation of a scalar w.r.t. a matrix when both are functions of same vector


There is a scalar $s$ defined as

$$s=v^Tx$$

where the elements of the vector $v$ are defined by

$$v_i = \frac{\partial f}{\partial x_i}$$

Assume these first derivatives are not constants, so that each element $v_i$ is a function of the element $x_i$ (but independent of the other elements $x_{k\neq i}$).

There is also a (covariance) matrix $\Sigma$ whose $(i,j)$ element is a function of $x_i$ and $x_j$ (but independent of the other elements $x_{k\neq i,j}$):

$$\sigma_{ij} = g(x_i, x_j)$$
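For concreteness, here is a small numerical sketch of the setup. The specific choices $f(x)=\tfrac12\|x\|^2$ (so $v_i = x_i$, non-constant and depending only on $x_i$) and $g(x_i,x_j)=x_i x_j$ are purely illustrative assumptions, not part of the question:

```python
import numpy as np

# Illustrative choices (assumptions, not given in the question):
#   f(x) = 0.5 * ||x||^2  =>  v_i = df/dx_i = x_i  (non-constant, depends only on x_i)
#   g(x_i, x_j) = x_i * x_j  =>  sigma_ij depends only on x_i and x_j

x = np.array([1.0, 2.0, 3.0])

v = x.copy()            # v_i = x_i for this particular f
s = v @ x               # s = v^T x = ||x||^2

Sigma = np.outer(x, x)  # sigma_ij = g(x_i, x_j) = x_i * x_j

print(s)            # 14.0
print(Sigma[0, 1])  # 2.0
```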

What is $\frac{\partial s}{\partial \Sigma}$?

I'm thinking it is not zero because of the chain rule. This is what I came up with:

$$\left[\frac{\partial s}{\partial \Sigma}\right]_{ij}=\frac{\partial s}{\partial \sigma_{ij}}=\frac{\partial s}{\partial x_i}\frac{\partial x_i}{\partial \sigma_{ij}}+\frac{\partial s}{\partial x_j}\frac{\partial x_j}{\partial \sigma_{ij}}$$

Is that right?

EDIT:

Oh, I found this (Eq. 133 in The Matrix Cookbook):

$$\frac{\partial s}{\partial \sigma_{ij}}=\operatorname{tr} \left(\left(\frac{\partial s}{\partial \Sigma}\right)^T\frac{\partial \Sigma}{\partial \sigma_{ij}} \right)$$

Not sure if it's the answer, but it helps.
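For what it's worth, if the entries of $\Sigma$ are treated as independent variables, then $\frac{\partial \Sigma}{\partial \sigma_{ij}}$ is the single-entry matrix $J^{ij}$ (a 1 in position $(i,j)$, zeros elsewhere), and the trace identity just picks out the $(i,j)$ entry:

$$\operatorname{tr}\left(\left(\frac{\partial s}{\partial \Sigma}\right)^T \frac{\partial \Sigma}{\partial \sigma_{ij}}\right)=\operatorname{tr}\left(\left(\frac{\partial s}{\partial \Sigma}\right)^T J^{ij}\right)=\left[\frac{\partial s}{\partial \Sigma}\right]_{ij}$$

So this identity relates the matrix derivative to its scalar components, but it doesn't by itself resolve how the $x$-dependence should be handled.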