Derivative of multivariate Gaussian probability density function

58 Views Asked by Bumbble Comm At 27 Mar 2026 - 2:57

Question: We have $$ \begin{bmatrix} s^1 \\ \vdots \\ s^k \end{bmatrix} \in\mathbb{R}^{kL},\quad \begin{bmatrix} \mu^1 \\ \vdots \\ \mu^k \end{bmatrix} \in\mathbb{R}^{kL},\quad \Sigma^k\in\mathbb{R}^{kL\times kL} $$ where $s^t\in\mathbb{R}^L$ and $\mu^t\in\mathbb{R}^L$ for each $t\in\{1,\dots,k\}$. The probability density function is $$ \mathcal{N} \left( \begin{bmatrix} s^1 \\ \vdots \\ s^k \end{bmatrix}; \begin{bmatrix} \mu^1 \\ \vdots \\ \mu^k \end{bmatrix}, \Sigma^k \right) =\frac{1}{\sqrt{\text{det}(2\pi\Sigma^k)}} \exp\left(-\frac{1}{2} \left( \begin{bmatrix} s^1 \\ \vdots \\ s^k \end{bmatrix} - \begin{bmatrix} \mu^1 \\ \vdots \\ \mu^k \end{bmatrix} \right)^\top (\Sigma^k)^{-1} \left( \begin{bmatrix} s^1 \\ \vdots \\ s^k \end{bmatrix} - \begin{bmatrix} \mu^1 \\ \vdots \\ \mu^k \end{bmatrix} \right) \right) . $$ What is the gradient of this density with respect to a single block $s^t$, that is, $$ \nabla_{s^t} \mathcal{N} \left( \begin{bmatrix} s^1 \\ \vdots \\ s^k \end{bmatrix}; \begin{bmatrix} \mu^1 \\ \vdots \\ \mu^k \end{bmatrix}, \Sigma^k \right), $$ for $t\in\{1,\dots,k\}$? Thanks.

Attempt: I am able to derive this for the simpler case of $\mathcal{N}(s;\mu,\Sigma)$ with $s\in\mathbb{R}^L$, $\mu\in\mathbb{R}^L$, and $\Sigma\in\mathbb{R}^{L\times L}$, where $$ \mathcal{N}(s;\mu,\Sigma) =\frac{\exp(-\frac{1}{2}(s-\mu)^\top\Sigma^{-1}(s-\mu))}{\sqrt{\text{det}(2\pi\Sigma)}}. $$ For this simple case, we have $$ \nabla_{s}\mathcal{N}(s;\mu,\Sigma) =\Sigma^{-1}(\mu-s)\mathcal{N}(s;\mu,\Sigma). $$ How do I do it for the more complicated case shown above?

Bumbble Comm On 06 Nov 2023 - 10:02 BEST ANSWER

Define the notation $$\vec{s} = \begin{bmatrix}s^1\\\vdots\\s^k\end{bmatrix}\qquad \vec{\mu} = \begin{bmatrix}\mu^1\\\vdots\\\mu^k\end{bmatrix} \qquad \vec{\Sigma} = \begin{bmatrix} \Sigma^{1,1} &\dots &\Sigma^{1,k}\\ \vdots & \ddots &\vdots\\\Sigma^{k,1} & \dots & \Sigma^{k,k}\end{bmatrix}$$ where $\vec{\Sigma}$ is the matrix written $\Sigma^k$ in the question and each block $\Sigma^{i,j}$ is $L\times L$. The stacked vector $\vec{s}$ is just a Gaussian vector in $\mathbb{R}^{kL}$, so the gradient with respect to all of $[s^1, \dots, s^k]$ has exactly the same form as in the simple case, $$\nabla_{s^1, \dots, s^k}\mathcal{N}(\vec{s}; \vec{\mu}, \vec{\Sigma}) = \vec{\Sigma}^{-1}(\vec{\mu} - \vec{s}) \mathcal{N}(\vec{s}; \vec{\mu}, \vec{\Sigma})$$ This is a vector of dimension $kL$. The gradient with respect to $s^t$ is obtained by reading off the $L$ entries that correspond to $s^t$, $$ \nabla_{s^t}\mathcal{N}(\vec{s}; \vec{\mu}, \vec{\Sigma}) = \mathcal{N}(\vec{s}; \vec{\mu}, \vec{\Sigma}) \cdot \sum_{i = 1}^k \Pi^{t,i}(\mu^i - s^i)$$ where $\Pi^{i,j}\in\mathbb{R}^{L\times L}$ denotes the $(i,j)$ block of the inverse covariance, $$ \vec{\Sigma}^{-1} = \begin{bmatrix}\Pi^{1,1} &\dots &\Pi^{1,k}\\ \vdots & \ddots &\vdots\\\Pi^{k,1} & \dots & \Pi^{k,k}\end{bmatrix}.$$ Note that $\Pi^{t,i}$ is a block of the inverse of the full matrix; in general it is not equal to $(\Sigma^{t,i})^{-1}$.
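As a sanity check, the block formula can be compared numerically against central finite differences. The following is a minimal sketch, assuming NumPy is available; the function names (`gaussian_pdf`, `grad_block`, `grad_block_fd`), the sizes `k = 3`, `L = 2`, and the randomly generated covariance are illustrative choices and not part of the original answer.

```python
import numpy as np


def gaussian_pdf(s, mu, Sigma):
    """Density N(s; mu, Sigma) with the 1/sqrt(det(2*pi*Sigma)) normalisation."""
    d = s - mu
    quad = d @ np.linalg.solve(Sigma, d)            # (s-mu)^T Sigma^{-1} (s-mu)
    _, logdet = np.linalg.slogdet(2.0 * np.pi * Sigma)
    return np.exp(-0.5 * quad - 0.5 * logdet)


def grad_block(s, mu, Sigma, t, L):
    """Gradient w.r.t. block s^t (t is 1-based): N * sum_i Pi^{t,i} (mu^i - s^i)."""
    k = s.size // L
    Pi = np.linalg.inv(Sigma)                       # blocks of the inverse covariance
    rows = slice((t - 1) * L, t * L)
    total = np.zeros(L)
    for i in range(k):
        cols = slice(i * L, (i + 1) * L)
        total += Pi[rows, cols] @ (mu[cols] - s[cols])
    return gaussian_pdf(s, mu, Sigma) * total


def grad_block_fd(s, mu, Sigma, t, L, h=1e-6):
    """Central finite differences over the L entries of block s^t."""
    g = np.zeros(L)
    for j in range(L):
        e = np.zeros_like(s)
        e[(t - 1) * L + j] = h
        g[j] = (gaussian_pdf(s + e, mu, Sigma)
                - gaussian_pdf(s - e, mu, Sigma)) / (2.0 * h)
    return g


# Hypothetical test problem: a random symmetric positive definite covariance.
rng = np.random.default_rng(0)
k, L = 3, 2
A = rng.standard_normal((k * L, k * L))
Sigma = A @ A.T / (k * L) + np.eye(k * L)
mu = rng.standard_normal(k * L)
s = mu + 0.5 * rng.standard_normal(k * L)

for t in range(1, k + 1):
    analytic = grad_block(s, mu, Sigma, t, L)
    numeric = grad_block_fd(s, mu, Sigma, t, L)
    print(t, np.allclose(analytic, numeric, rtol=1e-5, atol=1e-10))
```

The loop over `i` mirrors the sum in the formula; in practice the same result is obtained by slicing rows `(t-1)L` to `tL-1` of the full gradient $\vec{\Sigma}^{-1}(\vec{\mu}-\vec{s})\,\mathcal{N}(\vec{s};\vec{\mu},\vec{\Sigma})$.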
