Vector derivative of $ f(x)= (A+B\operatorname{diag}(x))^{-1} b$


How can I find the derivative with respect to $x\in \mathbb{R}^n$ of \begin{align}f(x)= (A+B \operatorname{diag}(x))^{-1} b \end{align} where $\operatorname{diag}(x)$ is the diagonal matrix with $x$ on its main diagonal, $A\in \mathbb{R}^{n \times n}$, $B\in \mathbb{R}^{n \times n}$, and $b \in \mathbb{R}^n$?

This question is similar to one I have asked here. However, the matrix multiplication is different in this case, which has caused me some confusion.

I am also wondering whether this can be shown using the $\epsilon$-definition of the derivative.


There are 2 best solutions below

9
On BEST ANSWER

You can obtain the derivative by the chain rule. Let \begin{equation} \begin{array}{ll} \Phi\colon &GL_n({\mathbb R})\to {\mathbb R}^{n\times n}\cr &U \mapsto U^{-1} \end{array} \end{equation} \begin{equation} \begin{array}{ll} g \colon &{\mathbb R}^n\to {\mathbb R}^{n\times n}\cr &x \mapsto A + B \operatorname{diag}(x) \end{array} \end{equation} so that $f(x) = \Phi(g(x))\, b$. Then $\Phi'(U)\cdot H = -U^{-1} H U^{-1}$ and $g'(x)\cdot h = B \operatorname{diag}(h)$, hence by the chain rule \begin{equation} f'(x)\cdot h =- ( A + B \operatorname{diag}(x))^{-1} B \operatorname{diag}(h)( A + B \operatorname{diag}(x))^{-1} b \end{equation} In terms of partial derivatives, this means that \begin{equation} \frac{\partial f}{\partial x_i} =- ( A + B \operatorname{diag}(x))^{-1} B E_{i,i}( A + B \operatorname{diag}(x))^{-1} b \end{equation} where $E_{i, i}$ is the matrix whose entries are all zero except for the entry at position $(i, i)$, which is $1$; equivalently, $E_{i, i} = e_i e_i^T$, where $e_i$ is the $i$-th standard basis column vector.

In particular, $e_i^T ( A + B \operatorname{diag}(x))^{-1} b$ is a scalar, namely the $i$-th component $f_i(x)$ of $f(x)$, so that $\frac{\partial f}{\partial x_i} = -f_i(x)\,( A + B \operatorname{diag}(x))^{-1} B e_i$. It follows that the Jacobian matrix of $f$, whose columns are the vectors $\frac{\partial f}{\partial x_i}$, is \begin{equation} \partial f = - ( A + B \operatorname{diag}(x))^{-1} B \operatorname{diag}(f(x)) \end{equation}
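A quick numerical sanity check of the Jacobian formula can be reassuring. The sketch below is only an illustration, not part of the derivation: it assumes NumPy, random test matrices with a diagonal shift on $A$ so that $A + B\operatorname{diag}(x)$ is safely invertible, and helper names (`f`, `jacobian_closed_form`, `jacobian_finite_diff`) that are made up for this example. It compares the closed form against central finite differences.

```python
import numpy as np

def f(x, A, B, b):
    """Evaluate f(x) = (A + B diag(x))^{-1} b via a linear solve."""
    return np.linalg.solve(A + B @ np.diag(x), b)

def jacobian_closed_form(x, A, B, b):
    """Closed-form Jacobian: -(A + B diag(x))^{-1} B diag(f(x))."""
    M = A + B @ np.diag(x)
    fx = np.linalg.solve(M, b)
    return -np.linalg.solve(M, B @ np.diag(fx))

def jacobian_finite_diff(x, A, B, b, eps=1e-6):
    """Central-difference approximation, one column per coordinate x_i."""
    n = x.size
    J = np.empty((n, n))
    for i in range(n):
        step = np.zeros(n)
        step[i] = eps
        J[:, i] = (f(x + step, A, B, b) - f(x - step, A, B, b)) / (2 * eps)
    return J

# Random test data (an assumption for this illustration only).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 10.0 * np.eye(n)  # shift keeps A + B diag(x) invertible
B = rng.standard_normal((n, n))
b = rng.standard_normal(n)
x = 0.1 * rng.standard_normal(n)

J_exact = jacobian_closed_form(x, A, B, b)
J_num = jacobian_finite_diff(x, A, B, b)
max_err = np.max(np.abs(J_exact - J_num))
print("max abs difference:", max_err)
```

If the formula is right, the printed difference should be tiny compared with the entries of the Jacobian, limited only by finite-difference and rounding error.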

4
On

Write the function as $\;f = M^{-1}b$
where $$\eqalign{ &M=A+BX,\quad X={\rm Diag}(x),\quad F={\rm Diag}(f) \\ &Xf= Fx = f\odot x\qquad(\odot{\rm \,denotes\,Hadamard\,Product}) \\ }$$ then calculate the differential and gradient of the function $$\eqalign{ df &= dM^{-1}b \\&= -M^{-1}\,dM\,M^{-1}b \\ &= -M^{-1}\,dM\,f \\ &= -M^{-1}(B\;dX)\,f \\ &= -M^{-1}BF\,dx\\ \frac{\partial f}{\partial x} &= -M^{-1}BF \;=\; -\Big(A+B\,{\rm Diag}(x)\Big)^{-1}B\;{\rm Diag}(f) \\ }$$
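
As a sanity check, the scalar case $n=1$ can be worked out by hand. Writing $a$ and $\beta$ for the $1\times 1$ matrices $A$ and $B$ (names chosen here only for illustration), ordinary calculus gives $$\eqalign{ f(x) &= \frac{b}{a+\beta x} \\ \frac{df}{dx} &= -\frac{\beta\,b}{(a+\beta x)^2} \;=\; -(a+\beta x)^{-1}\,\beta\,f(x) \\ }$$ which agrees with $-M^{-1}BF$ when $M=a+\beta x$, $B=\beta$ and $F=f(x)$.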

Update

The following derivation was requested by a commenter.
Write the definition of the matrix inverse and take its differential. $$\eqalign{ I &= M^{-1}M \\ 0 &= dM^{-1}M + M^{-1}dM \\ &= dM^{-1} + M^{-1}dM\,M^{-1} \\ dM^{-1} &= -M^{-1}dM\,M^{-1} \\ }$$
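
The identity for $dM^{-1}$ can also be checked numerically. The following is a minimal sketch, assuming NumPy and a random, diagonally shifted test matrix (both assumptions of this illustration): if $-M^{-1}\,dM\,M^{-1}$ is the correct first-order term, the remainder after subtracting it should shrink roughly like $t^2$ as the perturbation size $t$ decreases.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)) + 10.0 * np.eye(n)  # shifted so M is safely invertible
dM = rng.standard_normal((n, n))                    # arbitrary perturbation direction

M_inv = np.linalg.inv(M)
predicted = -M_inv @ dM @ M_inv  # first-order term given by the identity

steps = (1e-2, 1e-3, 1e-4)
remainders = []
for t in steps:
    actual = np.linalg.inv(M + t * dM) - M_inv           # exact change in the inverse
    remainders.append(np.linalg.norm(actual - t * predicted))

for t, r in zip(steps, remainders):
    print(f"t = {t:.0e}, remainder norm = {r:.3e}")
```

Each tenfold reduction in `t` should cut the remainder by roughly a factor of one hundred, which is the behaviour expected of a second-order error term.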