Derivative of an arbitrarily long product of matrices with respect to a single variable, where each matrix depends on the variable


I have a question about matrix derivatives that I couldn't find an answer to in "The Matrix Cookbook".

Let $F(\alpha)=X_n(\alpha)X_{n-1}(\alpha)\cdots X_{2}(\alpha)X_{1}(\alpha)$.

If $n=2$

then $\frac{\partial F(\alpha)}{\partial \alpha} = X_2 \frac{\partial X_1(\alpha)}{\partial \alpha} + \frac{\partial X_2(\alpha)}{\partial \alpha}X_1$

When I go to the general case $n=N$,

I start to apply the product rule, and a recursive pattern seems to form, but I cannot see a clean way to write the pattern in mathematical notation.

Is there a clean formula for what the derivative of this matrix product would be when $n=N$?

If anyone could offer their input, it would be greatly appreciated.

Thank you.


There are 2 best solutions below

On BEST ANSWER

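By induction on $N$, I'll show that $$\frac{\partial}{\partial\alpha} (X_1(\alpha)\dots X_N(\alpha)) = \sum_{i=1}^N X_1(\alpha)\dots X_{i-1}(\alpha)\frac{\partial X_i}{\partial\alpha}(\alpha)X_{i+1}(\alpha)\dots X_N(\alpha)\,. $$ Here the factors are numbered from left to right; the question numbers them from right to left, which only relabels the indices.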

The base case $N=1$ is obvious.

Below I will write $X_i$ instead of $X_i(\alpha)$ to increase readability.

Now assume the formula holds for some $N\in\mathbb N$; we show that it also holds for $N+1$. Consider $\frac{\partial}{\partial\alpha} (X_1\dots X_{N+1})$. As you noted in your question, for two matrices $X$ and $Y$ the product rule gives $\frac{\partial}{\partial\alpha} (XY) = \frac{\partial X}{\partial\alpha} Y + X\frac{\partial Y}{\partial\alpha}$. Applying it with $X=X_1\dots X_N$ and $Y=X_{N+1}$ yields

$$\begin{align}\frac{\partial}{\partial\alpha}(X_1\dots X_{N+1}) &= \left[\frac{\partial}{\partial\alpha} (X_1\dots X_N)\right]X_{N+1} + X_1\dots X_N \frac{\partial X_{N+1}}{\partial\alpha} \\ &= \sum_{i=1}^N X_1\dots X_{i-1}\frac{\partial X_i}{\partial\alpha}X_{i+1}\dots X_N X_{N+1} + X_1\dots X_N\frac{\partial X_{N+1}}{\partial\alpha}\\ &=\sum_{i=1}^{N+1} X_1\dots X_{i-1}\frac{\partial X_i}{\partial\alpha}X_{i+1}\dots X_{N+1}\,,\end{align}$$

which concludes the induction step, and thus the proof.
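As a sanity check (not part of the proof), the sum formula can be compared against a central finite difference. The sketch below is only an illustration: the $2\times 2$ factors `X(i, a)`, their hand-written derivatives `dX(i, a)`, and all helper names are assumptions made up for this example, and plain Python lists are used to avoid any dependency.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def chain(mats):
    """Ordered product mats[0] mats[1] ... mats[-1] (at least one matrix)."""
    P = mats[0]
    for M in mats[1:]:
        P = matmul(P, M)
    return P

def X(i, a):
    """Hypothetical i-th factor X_i(alpha), chosen only for illustration."""
    return [[math.cos(i * a), a ** i],
            [-a, 1.0 + math.sin(i * a)]]

def dX(i, a):
    """Entrywise derivative of X(i, a) with respect to a, written by hand."""
    return [[-i * math.sin(i * a), i * a ** (i - 1)],
            [-1.0, i * math.cos(i * a)]]

def F(a, N):
    """F(alpha) = X_1(alpha) X_2(alpha) ... X_N(alpha)."""
    return chain([X(i, a) for i in range(1, N + 1)])

def dF_formula(a, N):
    """Sum over i of X_1 ... X_{i-1} (dX_i) X_{i+1} ... X_N."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(1, N + 1):
        # Replace only the i-th factor by its derivative, keeping the order.
        factors = [dX(j, a) if j == i else X(j, a) for j in range(1, N + 1)]
        total = matadd(total, chain(factors))
    return total

def dF_numeric(a, N, h=1e-6):
    """Central finite-difference approximation of dF/dalpha."""
    Fp, Fm = F(a + h, N), F(a - h, N)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
            for rp, rm in zip(Fp, Fm)]

def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

if __name__ == "__main__":
    # Should be tiny if the formula and the finite difference agree.
    print(max_abs_diff(dF_formula(0.7, 4), dF_numeric(0.7, 4)))
```

For the ordering used in the question, $X_N\cdots X_1$, the same check works after reversing the list of factors passed to `chain`.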

On

$ \def\LR#1{\left(#1\right)} \def\c#1{\color{red}{#1}} $ Use the product symbol $$F = \LR{\prod_{i=\tt1}^{n} X_i} = X_1X_2\cdots X_n$$ and the convention that if the lower bound exceeds the upper bound, then the product evaluates to the identity matrix, e.g. $$\LR{\prod_{i=3}^{2} X_i} \doteq I$$

Then the derivative of the product can be written as a sum over such product symbols $$dF = {\Large\sum_{j=\tt1}^n} \LR{\prod_{i=\tt1}^{j-\tt1} X_i}\, dX_j \LR{\prod_{k=j+\tt1}^nX_k}$$

Or, for $n\ge 2$, skip the convention and write the $(j=\tt1)$ and $(j=n)$ terms $\,\c{{\rm explicitly}}$ $$ dF = \c{dX_{\tt1} \LR{\prod_{k=2}^nX_k}} + {\Large\sum_{j=2}^{n-\tt1}} \LR{\prod_{i=\tt1}^{j-\tt1} X_i}\, dX_j \LR{\prod_{k=j+\tt1}^nX_k} + \c{\LR{\prod_{i=\tt1}^{n-\tt1} X_i}\, dX_n} $$
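The version with the empty-product convention translates fairly directly into code: store the left products $\prod_{i<j}X_i$ and right products $\prod_{k>j}X_k$ once, starting each list from the identity, and every term of the sum becomes two multiplications. The following is a minimal sketch under that reading; the function name `differential_of_product` and the list-of-rows matrix representation are assumptions for illustration, and the matrices are taken to be square and of equal size.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(m):
    """m-by-m identity matrix; plays the role of the empty product."""
    return [[1 if r == c else 0 for c in range(m)] for r in range(m)]

def differential_of_product(Xs, dXs):
    """dF for F = Xs[0] Xs[1] ... Xs[n-1], given the differentials dXs[j].

    Indices are 0-based here, so Xs[j] corresponds to X_{j+1} in the text.
    """
    n, m = len(Xs), len(Xs[0])

    # prefix[j] = Xs[0] ... Xs[j-1], with prefix[0] = I (empty product).
    prefix = [identity(m)]
    for Xj in Xs:
        prefix.append(matmul(prefix[-1], Xj))

    # suffix[j] = Xs[j] ... Xs[n-1], with suffix[n] = I (empty product).
    suffix = [None] * (n + 1)
    suffix[n] = identity(m)
    for j in range(n - 1, -1, -1):
        suffix[j] = matmul(Xs[j], suffix[j + 1])

    # Sum of (left product) dX_j (right product) over all positions j.
    dF = [[0] * m for _ in range(m)]
    for j in range(n):
        term = matmul(matmul(prefix[j], dXs[j]), suffix[j + 1])
        dF = matadd(dF, term)
    return dF

if __name__ == "__main__":
    A, B = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
    dA, dB = [[1, 0], [0, 1]], [[2, 0], [0, 0]]
    # For n = 2 this is dA B + A dB.
    print(differential_of_product([A, B], [dA, dB]))
```

Caching the partial products keeps the number of matrix multiplications linear in $n$, whereas evaluating each term of the sum from scratch would need on the order of $n^2$ multiplications.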