Jacobian of $A (A^\top X A)^{-1} A^\top$


Let $A\in\mathbb{R}^{n\times m}$, $n\geq m$, be a full column rank matrix, and consider the function \begin{align} f&\colon \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}\\ & X\mapsto A (A^\top X A)^{-1} A^\top, \end{align} where $\bullet^\top$ denotes transposition.

Assuming that $(A^\top X A)^{-1}$ exists, I'm interested in the computation of the Jacobian matrix of $f$, i.e. $$\tag{1}\label{a} \mathbf{J}[f] = \left[\frac{\partial f(X)}{\partial X_{ij}}\right]\in\mathbb{R}^{n^2\times n^2}. $$

I know that there exists a closed-form expression for the Jacobian of the inverse, namely $\mathbf{J}[X^{-1}]=-(X^{-\top} \otimes X^{-1})$ (see e.g. here, page 5). Hence, I wonder whether a similar closed-form expression can be derived for \eqref{a}.
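As a quick sanity check on the cited identity (my own addition, not part of the original question), the closed-form Jacobian of $X \mapsto X^{-1}$ can be compared against a central finite-difference approximation; all names below are illustrative, and `vec` is taken column-major so that $\operatorname{vec}(dX^{-1}) = -(X^{-\top}\otimes X^{-1})\operatorname{vec}(dX)$:

```python
import numpy as np

# Compare the closed-form Jacobian of X -> X^{-1}, i.e. -(X^{-T} kron X^{-1}),
# against central finite differences, column by column.
rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n)) + n * np.eye(n)  # keep X well-conditioned

Xinv = np.linalg.inv(X)
J_closed = -np.kron(Xinv.T, Xinv)

h = 1e-6
J_fd = np.zeros((n * n, n * n))
for j in range(n):
    for i in range(n):
        E = np.zeros((n, n)); E[i, j] = 1.0
        D = (np.linalg.inv(X + h * E) - np.linalg.inv(X - h * E)) / (2 * h)
        J_fd[:, i + j * n] = D.flatten(order="F")  # column-major vec

print(np.max(np.abs(J_closed - J_fd)))  # small
```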

Thanks in advance.

There are 2 answers below.

Best answer:

Given $\mathrm A \in \mathbb R^{n \times m}$ with full column rank, the matrix-valued function $\mathrm F : \mathbb R^{n \times n} \to \mathbb R^{n \times n}$ is defined as follows

$$\mathrm F (\mathrm X) := \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top}$$

Hence,

Using the expansion $(\mathrm B + h \mathrm C)^{-1} = \mathrm B^{-1} - h \, \mathrm B^{-1} \mathrm C \, \mathrm B^{-1} + O(h^2)$ with $\mathrm B := \mathrm A^{\top} \mathrm X \mathrm A$ and $\mathrm C := \mathrm A^{\top} \mathrm V \mathrm A$,

$$\mathrm F (\mathrm X + h \mathrm V) = \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A + h \, \mathrm A^{\top} \mathrm V \mathrm A \right)^{-1} \mathrm A^{\top} = \mathrm F (\mathrm X) - h \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm V \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} + O(h^2)$$

Thus, the directional derivative of $\mathrm F$ in the direction of $\mathrm V$ at $\mathrm X$ is the matrix-valued function

$$- \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm V \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top}$$

Setting $\mathrm V = \mathrm e_i \mathrm e_j^{\top}$, we obtain

$$\partial_{x_{ij}} \mathrm F (\mathrm X) = - \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm e_i \mathrm e_j^{\top} \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} = \color{blue}{- \mathrm F (\mathrm X) \, \mathrm e_i \mathrm e_j^{\top} \mathrm F (\mathrm X)}$$

which is the negative of the outer product of the $i$-th column and the $j$-th row of $\mathrm F (\mathrm X)$.

Vectorizing the directional derivative, we obtain

$$\mbox{vec} \left( - \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \mathrm V \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \right) = \color{blue}{- \left( \mathrm A \left( \mathrm A^{\top} \mathrm X^{\top} \mathrm A \right)^{-1} \mathrm A^{\top} \otimes \mathrm A \left( \mathrm A^{\top} \mathrm X \mathrm A \right)^{-1} \mathrm A^{\top} \right)} \mbox{vec} (\mathrm V)$$
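The vectorized formula above can be verified numerically. The following sketch (my addition, with arbitrary random $\mathrm A$, $\mathrm X$, $\mathrm V$; all variable names are illustrative) compares the finite-difference directional derivative with the closed form $-\mathrm F(\mathrm X) \mathrm V \mathrm F(\mathrm X)$, and the column-major `vec` of that matrix with $-(\mathrm F(\mathrm X)^{\top} \otimes \mathrm F(\mathrm X))\,\mbox{vec}(\mathrm V)$:

```python
import numpy as np

# Verify: dF[V] = -F(X) V F(X), and vec(dF[V]) = -(F(X)^T kron F(X)) vec(V).
rng = np.random.default_rng(1)
n, m = 5, 3
A = rng.standard_normal((n, m))               # full column rank a.s.
X = rng.standard_normal((n, n)) + n * np.eye(n)  # A^T X A invertible
V = rng.standard_normal((n, n))

def F(X):
    return A @ np.linalg.inv(A.T @ X @ A) @ A.T

# finite-difference directional derivative
h = 1e-6
dF_fd = (F(X + h * V) - F(X - h * V)) / (2 * h)

# closed forms from the answer
dF_closed = -F(X) @ V @ F(X)
vec = lambda M: M.flatten(order="F")          # column-major vec
lhs = vec(dF_closed)
rhs = -np.kron(F(X).T, F(X)) @ vec(V)

print(np.max(np.abs(dF_fd - dF_closed)))      # small (finite-difference error)
print(np.max(np.abs(lhs - rhs)))              # essentially zero (algebraically identical)
```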

Another answer:

For typing convenience, define the matrix
$$\eqalign{
\def\B{B^{-1}}
\def\p{\partial}
\def\qq{\qquad\qquad}
\def\vc{\operatorname{vec}}
\def\grad#1#2{\frac{\p #1}{\p #2}}
B &= A^TXA \\
dB &= A^T\;dX\;A \\
d\B &= -\B\;dB\;\B \;=\; -\B A^T\;dX\;A\B \\
}$$
and use it to write the function and calculate its differential
$$\eqalign{
F &= A\B A^T \\
dF &= A\;d\B\,A^T \;=\; -F\;dX\;F \qq \\
}$$
from which one may obtain the componentwise matrix-valued gradients
$$\eqalign{
\grad F{X_{ij}} &= -F\,E_{ij}\,F \qq\qq\qq \\
}$$
or a vectorized (Jacobian) form
$$\eqalign{
\grad {\vc(F)}{\vc(X)} &= -{F^T\otimes F} \qq\qq\quad \\
}$$
or a tensor-valued gradient
$$\eqalign{
\grad FX &= -F\,{\cal E}\,F^T \qq\qq\qq \\
}$$
where ${\cal E}$ is a fourth-order tensor whose components can be written in terms of Kronecker delta symbols as $\,{\cal E}_{ijk\ell} = \delta_{ik}\,\delta_{j\ell}\;$ while $\;E_{ij}$ is a matrix whose components are all zero except for the $(i,j)$ component, which is equal to one.
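The componentwise identity $\partial F/\partial X_{ij} = -F\,E_{ij}\,F$ admits the same kind of finite-difference check as the first answer; this sketch (my addition, with arbitrary illustrative sizes and indices) compares a single component gradient against central differences:

```python
import numpy as np

# Check dF/dX_ij = -F E_ij F for one (i, j), where E_ij is the single-entry matrix.
rng = np.random.default_rng(2)
n, m = 4, 2
A = rng.standard_normal((n, m))
X = rng.standard_normal((n, n)) + n * np.eye(n)  # keeps A^T X A invertible

def F(X):
    return A @ np.linalg.inv(A.T @ X @ A) @ A.T

h, i, j = 1e-6, 1, 3
E = np.zeros((n, n)); E[i, j] = 1.0
fd = (F(X + h * E) - F(X - h * E)) / (2 * h)
closed = -F(X) @ E @ F(X)
print(np.max(np.abs(fd - closed)))  # small
```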