Find $\frac{\partial}{\partial a_{i,j}} \left[ \mathbf{c}^T (s \mathbf{I} - \mathbf{A})^{-1} \mathbf{b} \right] $

I want to find:

$$ \frac{d}{d a_{i,j}} H(s, \mathbf{A}) = \frac{\partial}{\partial a_{i,j}} \left[ \mathbf{c}^T (s \mathbf{I} - \mathbf{A})^{-1} \mathbf{b} \right] $$

where $a_{i,j}$ is the $i,j$-th element of $\mathbf{A}$, $s$ is a constant, and $\mathbf{c}, \mathbf{b}$ do not depend on $a_{i,j}$. Since $H(\cdot)$ is a scalar function, the Matrix Cookbook gives:

$$ \frac{d}{d a_{i,j}} H(\cdot) = \text{tr}\left( \left[\frac{\partial H}{\partial \mathbf{A}}\right]^T \frac{\partial \mathbf{A}}{\partial a_{i,j}} \right) $$

Now $\frac{\partial \mathbf{A}}{\partial a_{i,j}}$ is easy but I am wondering about $\frac{\partial H}{\partial \mathbf{A}}$. What is this partial derivative?


Attempt at an answer:

To set up the chain rule, write:

$$ H(\cdot) = \mathbf{c}^{T} g(\mathbf{A}) \mathbf{b} $$

where $g(\mathbf{A}) = (s \mathbf{I} - \mathbf{A})^{-1}$. Thus:

$$ \frac{\partial H}{\partial \mathbf{A}} = \frac{\partial H}{ \partial g(\mathbf{A})} \frac{\partial g(\mathbf{A})}{ \partial \mathbf{A}} $$

Well, $\frac{\partial H}{ \partial g(\mathbf{A})} = \mathbf{c} \mathbf{b}^T$, so we just need to find $ \frac{\partial g(\mathbf{A})}{ \partial \mathbf{A}} $. We know from the Matrix Cookbook that:

$$ \frac{\partial \mathbf{Y}^{-1}}{\partial x} = - \mathbf{Y}^{-1} \frac{\partial \mathbf{Y}}{\partial x} \mathbf{Y}^{-1} $$

So then, applying this, we have:

$$ \frac{\partial g(\mathbf{A})}{ \partial \mathbf{A}} = - {(s \mathbf{I} - \mathbf{A})}^{-1} \frac{ \partial (s \mathbf{I} - \mathbf{A})}{ \partial \mathbf{A} } (s \mathbf{I} - \mathbf{A})^{-1} = - {(s \mathbf{I} - \mathbf{A})}^{-1} (-\mathbf{I}) (s \mathbf{I} - \mathbf{A})^{-1} = (s \mathbf{I} - \mathbf{A})^{-2} $$

Therefore:

$$ \frac{d}{d a_{i,j}} H(\cdot) = \text{tr}\left( \left[\frac{\partial H}{\partial \mathbf{A}}\right]^T \frac{\partial \mathbf{A}}{\partial a_{i,j}} \right) $$

$$ \frac{d}{d a_{i,j}} H(\cdot) = \mathbf{b}^{T} \mathbf{c} \cdot \text{tr}\left( (s \mathbf{I} - \mathbf{A})^{-2 T} \frac{\partial \mathbf{A}}{\partial a_{i,j}} \right) $$

Is that correct or am I missing something?

Best answer:

You can compute the matrix gradient $\frac{\partial H}{\partial \mathbf{A}}$ directly by working with differentials.

Let $\mathbf{D}=s \mathbf{I} - \mathbf{A}$, so the quantity you are differentiating can be written as

$$ \phi = \mathbf{c}\mathbf{b}^T: \mathbf{D}^{-1} $$

where the colon denotes the Frobenius inner product, $\mathbf{X}:\mathbf{Y} = \text{tr}(\mathbf{X}^T \mathbf{Y})$.

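Taking differentials, with $d(\mathbf{D}^{-1}) = -\mathbf{D}^{-1}(d\mathbf{D})\mathbf{D}^{-1}$ and $d\mathbf{D} = -d\mathbf{A}$, gives

$$ d\phi = -\mathbf{c}\mathbf{b}^T: \mathbf{D}^{-1}(d\mathbf{D})\mathbf{D}^{-1} = -\mathbf{D}^{-T} \mathbf{c}\mathbf{b}^T \mathbf{D}^{-T}: d\mathbf{D} = \mathbf{D}^{-T} \mathbf{c}\mathbf{b}^T \mathbf{D}^{-T}: d\mathbf{A} $$

The matrix gradient is thus of the form $\frac{\partial \phi}{\partial \mathbf{A}} = \mathbf{u}\mathbf{v}^T$ with $\mathbf{u}=\mathbf{D}^{-T} \mathbf{c}, \mathbf{v}=\mathbf{D}^{-1} \mathbf{b}$. Since $\frac{\partial \mathbf{A}}{\partial a_{i,j}}$ has a single nonzero entry, the trace formula from the question picks out one element of this matrix: $\frac{\partial \phi}{\partial a_{i,j}} = u_i v_j$.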
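As a sanity check, the rank-one gradient $\mathbf{u}\mathbf{v}^T$ can be compared against central finite differences. The following is only a minimal NumPy sketch: the function names (`transfer`, `gradient`, `finite_difference`), the random test data, and the value `s = 10.0` are assumptions chosen for illustration and are not part of the original question.

```python
import numpy as np

def transfer(A, b, c, s):
    """Evaluate H(s, A) = c^T (s I - A)^{-1} b."""
    n = A.shape[0]
    return c @ np.linalg.solve(s * np.eye(n) - A, b)

def gradient(A, b, c, s):
    """Analytic gradient dH/dA = u v^T, with u = D^{-T} c and v = D^{-1} b."""
    n = A.shape[0]
    D = s * np.eye(n) - A
    u = np.linalg.solve(D.T, c)  # u = D^{-T} c
    v = np.linalg.solve(D, b)    # v = D^{-1} b
    return np.outer(u, v)

def finite_difference(A, b, c, s, eps=1e-6):
    """Central-difference estimate of dH/da_ij, one entry at a time."""
    n = A.shape[0]
    G = np.zeros_like(A)
    for i in range(n):
        for j in range(n):
            E = np.zeros_like(A)
            E[i, j] = eps  # perturb only the (i, j) entry of A
            G[i, j] = (transfer(A + E, b, c, s)
                       - transfer(A - E, b, c, s)) / (2 * eps)
    return G

# Hypothetical test data: a random 4x4 system.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
c = rng.standard_normal(n)
s = 10.0  # assumed value, chosen so that s I - A is comfortably invertible

G_analytic = gradient(A, b, c, s)
G_numeric = finite_difference(A, b, c, s)
print("max abs difference:", np.max(np.abs(G_analytic - G_numeric)))
```

If the derivation holds, the printed difference should be at the level of finite-difference noise. Using `np.linalg.solve` rather than forming the inverse explicitly is a design choice for numerical stability; it does not change the formula.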