Derivative of the expectation of a Gaussian w.r.t. mean and covariance


Say I have an $n$-dimensional multivariate Gaussian, $G(x:\mu, \Sigma)$.
($\mu$ is an $n$-dimensional vector and $\Sigma$ is an $n\times n$ matrix.)

Say there is a target $n$-dimensional vector $a$.

I need to move the multivariate Gaussian as close to the vector $a$ as possible by updating the parameters $\mu$ and $\Sigma$ in $n$-dimensional space.
In other words, I need to minimize the $Cost$ function defined below.

$Cost=E_{X \sim G(x:\mu, \Sigma)}[\operatorname{abs}(X-a)]$

Or maybe I should define $Cost$ with the squared error instead, as below.

$Cost=E_{X \sim G(x:\mu, \Sigma)}[(X-a)^2]$

Either way, I need to calculate the derivative of the $Cost$ function, that is, the derivative of an expectation w.r.t. $\mu$ and $\Sigma$, as below.

$\frac{\partial Cost}{\partial \mu}=?$
$\frac{\partial Cost}{\partial \Sigma}=?$
How can we calculate that?
Thank you.


There are 2 best solutions below


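$X-a$ is also a Gaussian vector, with mean $\mu - a$ and covariance matrix $\Sigma$.

So, using $\mathbb E\left[\|Y\|^2\right] = \left\|\mathbb E[Y]\right\|^2 + \text{Tr}\left(\text{Cov}[Y]\right)$ for any random vector $Y$ with finite second moments, $$\mathbb E \left[\left\|X-a\right\|^2\right] = \left\| \mathbb E \left[X-a\right]\right\|^2 + \text{Tr}\left(\text{Cov}\left[X-a\right]\right) = \left\|\mu - a\right\|^2 + \text{Tr}(\Sigma)$$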

Now you can differentiate and obtain:

$$\frac\partial{\partial \mu} \mathbb E \left[\left\|X-a\right\|^2\right] = 2(\mu - a)$$

and

$$\frac\partial{\partial \Sigma} \mathbb E \left[\left\|X-a\right\|^2\right] = \frac\partial{\partial \Sigma} \text{Tr}(\Sigma) = I$$

(treating the entries of $\Sigma$ as independent variables).
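
As a rough numerical sanity check, the sketch below compares a Monte Carlo estimate of $\mathbb E\left[\|X-a\|^2\right]$ with the closed form $\|\mu-a\|^2+\text{Tr}(\Sigma)$, and compares finite-difference gradients with $2(\mu-a)$ for $\mu$ and with $1$ for each diagonal entry of $\Sigma$. The values of `mu`, `a`, `Sigma`, the sample size, and the helper names `closed_form_cost` and `mc_cost` are arbitrary choices made for illustration, not part of the question.

```python
import numpy as np


def closed_form_cost(mu, Sigma, a):
    """E||X - a||^2 for X ~ N(mu, Sigma): ||mu - a||^2 + Tr(Sigma)."""
    return float(np.sum((mu - a) ** 2) + np.trace(Sigma))


def mc_cost(mu, Sigma, a, z):
    """Monte Carlo estimate of E||X - a||^2 from fixed standard-normal draws z."""
    L = np.linalg.cholesky(Sigma)  # Sigma = L @ L.T
    x = mu + z @ L.T               # each row is a sample from N(mu, Sigma)
    return float(np.mean(np.sum((x - a) ** 2, axis=1)))


rng = np.random.default_rng(0)
n = 3
mu = np.array([1.0, -2.0, 0.5])        # illustrative mean
a = np.array([0.0, 1.0, 2.0])          # illustrative target vector
B = rng.standard_normal((n, n))
Sigma = B @ B.T + n * np.eye(n)        # arbitrary positive definite covariance
z = rng.standard_normal((200_000, n))  # reused in every evaluation below

eps = 1e-4
basis = np.eye(n)

# Central finite differences of the Monte Carlo cost w.r.t. each mu_i.
grad_mu_fd = np.array([
    (mc_cost(mu + eps * e, Sigma, a, z) - mc_cost(mu - eps * e, Sigma, a, z))
    / (2 * eps)
    for e in basis
])

# Central finite differences w.r.t. each diagonal entry Sigma_ii.
grad_sigma_diag_fd = np.array([
    (mc_cost(mu, Sigma + eps * np.outer(e, e), a, z)
     - mc_cost(mu, Sigma - eps * np.outer(e, e), a, z))
    / (2 * eps)
    for e in basis
])

print("Monte Carlo cost:      ", mc_cost(mu, Sigma, a, z))
print("closed-form cost:      ", closed_form_cost(mu, Sigma, a))
print("d/dmu (finite diff):   ", grad_mu_fd)
print("d/dmu (closed form):   ", 2 * (mu - a))
print("d/dSigma_ii (fin diff):", grad_sigma_diag_fd)
```

Reusing the same draws `z` in every evaluation keeps the finite differences from being swamped by sampling noise; the remaining discrepancy from the closed forms should shrink as the number of samples grows.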


I'd go with the squared error, as you can differentiate it. We can compute the cost function by expanding out the expectation, i.e. in the scalar case $E[(X-a)^2] = E[X^2] - 2aE[X]+a^2$. Then we know by properties of the Gaussian r.v. what these are in terms of $\mu, \sigma^2$: $E[X]=\mu$ and $E[X^2]=\mu^2+\sigma^2$. You can find the derivative of the cost function with respect to a vector: it is the matrix whose rows correspond to differentiating the $i^{th}$ component of the cost function with respect to each of the vector's components, i.e.

The matrix $(\frac{\partial Cost}{\partial \mu})_{ij} = \frac{\partial Cost_i}{\partial \mu_j}$.

You would then set this equal to zero along with the corresponding $\sigma^2$ derivative.
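
As a worked instance of the expansion described in this answer, assume the scalar case $X \sim N(\mu, \sigma^2)$ with a scalar target $a$ (a simplification chosen here for illustration). Then

$$Cost = E[X^2] - 2aE[X] + a^2 = (\mu^2 + \sigma^2) - 2a\mu + a^2 = (\mu - a)^2 + \sigma^2$$

so that

$$\frac{\partial Cost}{\partial \mu} = 2(\mu - a), \qquad \frac{\partial Cost}{\partial \sigma^2} = 1$$

Setting the first to zero gives $\mu = a$. The second is a positive constant, so it has no zero: under these assumptions the cost simply keeps decreasing as $\sigma^2 \to 0$.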