Chain Rule: Derivative of Squared Mahalanobis Distance


I need to calculate the derivative of \begin{equation} d(\Pi',\Sigma^{-1})=\sum_{t=1}^{T}(Y_{t}-\Pi'X_{t})'\Sigma^{-1}(Y_{t}-\Pi'X_{t}) \end{equation} where $Y_{t}=\left[\begin{array}{c} Y_{1t} \\ \vdots \\ Y_{kt}\end{array}\right]$ and $X_{t}=\left[\begin{array}{c} 1 \\ X_{1t} \\ \vdots \\ X_{pt} \end{array}\right]$

with $t=1,\dots,T$ and \begin{equation} \Pi'=\left[\begin{array}{cccc} \mu & \Phi_{11} & \dots & \Phi_{1p} \\ \mu & \Phi_{21} & \dots & \Phi_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \mu & \Phi_{k1} & \dots & \Phi_{kp} \end{array}\right] \end{equation}

The derivative should look like \begin{equation} \frac{\partial d(\Pi',\Sigma^{-1})}{\partial \Pi'}=-2\sum_{t=1}^{T}\Sigma^{-1}(Y_{t}-\Pi'X_{t})\frac{\partial \Pi'X_{t}}{\partial \Pi'}\tag{*} \end{equation} What is $\frac{\partial \Pi'X_{t}}{\partial \Pi'}$?

I know that \begin{equation} \frac{\partial \Pi'X_{t}}{\partial \Pi'}=\frac{\partial}{\partial \Pi'}\left[\begin{array}{c} \mu+\Phi_{11}X_{1t}+\dots+\Phi_{1p}X_{pt} \\ \vdots \\ \mu+\Phi_{k1}X_{1t}+\dots+\Phi_{kp}X_{pt} \end{array} \right]=\left[\begin{array}{c} 1 \\ X_{1t} \\ \vdots \\ X_{pt}\end{array}\right] \end{equation} With this result the dimensions of the vectors in (*) do not match. I could simply transpose, but I think the transpose should come out naturally, shouldn't it?
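Since the question is where the transpose comes from, a quick numerical sanity check may help (a sketch with small random dimensions, not part of the original question). It suggests $X_t$ enters as the outer-product factor $X_t'$ on the right, so the full gradient is the $k\times(p+1)$ matrix $-2\sum_{t=1}^{T}\Sigma^{-1}(Y_{t}-\Pi'X_{t})X_{t}'$:

```python
import numpy as np

rng = np.random.default_rng(0)
k, p, T = 3, 2, 5
Y = rng.standard_normal((T, k))                                # row t is Y_t'
X = np.hstack([np.ones((T, 1)), rng.standard_normal((T, p))])  # row t is X_t' = (1, X_1t, ..., X_pt)
Pi = rng.standard_normal((k, p + 1))                           # plays the role of Pi'
S = rng.standard_normal((k, k))
Sigma_inv = S @ S.T                                            # symmetric PD, plays Sigma^{-1}

def d(Pi):
    # d(Pi', Sigma^{-1}) = sum_t (Y_t - Pi' X_t)' Sigma^{-1} (Y_t - Pi' X_t)
    R = Y - X @ Pi.T                                           # row t is (Y_t - Pi' X_t)'
    return np.einsum('ti,ij,tj->', R, Sigma_inv, R)

# Candidate closed form: -2 * sum_t Sigma^{-1} (Y_t - Pi' X_t) X_t'
grad_closed = -2 * Sigma_inv @ (Y - X @ Pi.T).T @ X

# Central finite differences, entry by entry
eps = 1e-6
grad_fd = np.zeros_like(Pi)
for i in range(k):
    for j in range(p + 1):
        E = np.zeros_like(Pi); E[i, j] = eps
        grad_fd[i, j] = (d(Pi + E) - d(Pi - E)) / (2 * eps)

print(np.allclose(grad_closed, grad_fd, atol=1e-4))  # True
```

So the transpose does appear naturally once the derivative is taken entrywise with respect to the matrix $\Pi'$ rather than treated as a vector product.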

For the first factor of the derivative above I used the following proposition.

Proposition

Let $\mathbf{x}$ be an $n\times 1$ vector and $\mathbf{A}$ an $n\times n$ matrix such that $\mathbf{A}'=\mathbf{A}$. Define the function $q: \mathbb{R}^{n}\rightarrow \mathbb{R}$ by \begin{equation} q(\mathbf{x}) =\mathbf{x}'\mathbf{A}\mathbf{x} \end{equation} Then

\begin{equation} \frac{\partial q(\mathbf{x})}{\partial \mathbf{x}}=2\mathbf{A}\mathbf{x} \end{equation}

First

\begin{equation} \frac{\partial x_{k}}{\partial x_{p}}=\delta_{p,k}= \begin{cases} 1 & \text{if}\quad k=p \\ 0 & \text{if}\quad k\neq p \end{cases} \end{equation} where $k,p=1,2,\dots,n$.

Then \begin{align} q(\mathbf{x}) & =\mathbf{x}'\mathbf{A}\mathbf{x}\\ & = \left[\begin{array}{ccc} x_{1} & \dots & x_{n} \end{array}\right]\left[\begin{array}{ccc} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{array}\right]\left[\begin{array}{c} x_{1} \\ \vdots \\ x_{n}\end{array}\right] \\ & = \sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}x_{i}x_{j} \end{align} The $p$-th component of the gradient is \begin{align} \frac{\partial q(\mathbf{x})}{\partial x_{p}}& =\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}\frac{\partial }{\partial x_{p}}\left[x_{i}x_{j}\right] \\ & = \sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}\left[\frac{\partial x_{i}}{\partial x_{p}}x_{j}+x_{i}\frac{\partial x_{j}}{\partial x_{p}}\right] \\ & = \sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}\left[\delta_{i,p}x_{j}+\delta_{j,p}x_{i}\right] \\ & = \sum_{j=1}^{n}a_{pj}x_{j}+\sum_{i=1}^{n}a_{ip}x_{i} \\ & = 2\sum_{j=1}^{n}a_{pj}x_{j}, \end{align} where the last step uses the symmetry $a_{ip}=a_{pi}$. Stacking the components for $p=1,\dots,n$ gives \begin{equation} \frac{\partial q(\mathbf{x})}{\partial \mathbf{x}} = 2\left[\begin{array}{c} \sum_{j=1}^{n}a_{1j}x_{j} \\ \vdots \\ \sum_{j=1}^{n}a_{nj}x_{j} \end{array}\right] = 2\left[\begin{array}{ccc} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{array}\right]\left[\begin{array}{c} x_{1} \\ \vdots \\ x_{n}\end{array}\right] = 2\mathbf{A}\mathbf{x}. \end{equation}
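The proposition itself is easy to verify numerically (a sketch with a random symmetric matrix, not part of the original derivation): the finite-difference gradient of $q(\mathbf{x})=\mathbf{x}'\mathbf{A}\mathbf{x}$ matches $2\mathbf{A}\mathbf{x}$ when $\mathbf{A}$ is symmetric.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
A = M + M.T                              # enforce A' = A
x = rng.standard_normal(n)

q = lambda v: v @ A @ v                  # q(x) = x'Ax

# Central finite differences along each coordinate direction
eps = 1e-6
grad_fd = np.array([(q(x + eps * e) - q(x - eps * e)) / (2 * eps)
                    for e in np.eye(n)])

print(np.allclose(grad_fd, 2 * A @ x, atol=1e-6))  # True
```

For a non-symmetric $\mathbf{A}$ the same check returns $(\mathbf{A}+\mathbf{A}')\mathbf{x}$ instead, which is why the proposition requires $\mathbf{A}'=\mathbf{A}$.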



Let $x$, $y$ be column vectors and $A$, $B$ square matrices. Then, since the term $x'Bx$ does not depend on $A$,

$$\begin{align} \frac{d\left([x + A y]' B [x + A y]\right)}{dA}&= \frac{d (x' B' A y)}{dA}+ \frac{d (y' A' B A y)}{dA} +\frac{d (x' B A y)}{dA}\\ &= B x y' + B' A y y' + B A y y' + B' x y' \\ &= (B+B')(x y' + A y y') \end{align} $$

by properties $(70)$ and $(82)$ in The Matrix Cookbook.

I think you can go on from here.
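As a numerical sanity check (a sketch with small random matrices, not part of the original answer), central finite differences agree with the closed form $(B+B')(xy' + Ayy')$; note the factor is $Ayy'$, not $yy'$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
x = rng.standard_normal(n)
y = rng.standard_normal(n)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

def f(A):
    u = x + A @ y
    return u @ B @ u                     # [x + Ay]' B [x + Ay]

# Closed form: (B + B')(x y' + A y y')
grad_closed = (B + B.T) @ (np.outer(x, y) + A @ np.outer(y, y))

# Central finite differences, entry by entry
eps = 1e-6
grad_fd = np.zeros_like(A)
for i in range(n):
    for j in range(n):
        E = np.zeros_like(A); E[i, j] = eps
        grad_fd[i, j] = (f(A + E) - f(A - E)) / (2 * eps)

print(np.allclose(grad_closed, grad_fd, atol=1e-4))  # True
```

Applying this with $A=-\Pi'$, $y=X_t$, $x=Y_t$, and $B=\Sigma^{-1}$ recovers the gradient of each summand in the original problem.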