I wish to find the derivative of $L^{-1}xx^TL^{-1}$ with respect to the symmetric positive definite matrix $L$, where $x$ is a vector. How do I proceed?
Derivative of $L^{-1}xx^TL^{-1}$ with respect to $L$
$
\def\E{{\cal E}} \def\d{\delta} \def\L{L^{-1}}
\def\LR#1{\left(#1\right)}
\def\op#1{\operatorname{#1}}
\def\vc#1{\op{vec}\LR{#1}}
\def\trace#1{\op{Tr}\LR{#1}}
\def\frob#1{\left\| #1 \right\|_F}
\def\qiq{\quad\implies\quad} \def\k{\otimes}
\def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}}
\def\c#1{\color{red}{#1}}
\def\CLR#1{\c{\LR{#1}}}
$

Given the symmetric matrix-valued function
$$\eqalign{
F &= \L xx^T\L \;\doteq\; F^T \qquad \qquad \qquad \qquad \qquad \qquad
}$$
calculate its differential with respect to $L$ (which is also symmetric)
$$\eqalign{
dF &= \CLR{d\L}xx^T\L + \L xx^T\CLR{d\L} \\
&= \CLR{-\L\,dL\:\L}xx^T\L + \L xx^T\CLR{-\L\,dL\:\L} \\
&= -\LR{\L\,dL\:F + F\:dL\:\L} \\
}$$
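As a sanity check, the differential $dF = -(L^{-1}\,dL\,F + F\,dL\,L^{-1})$ can be compared against a central finite difference. The following is a minimal NumPy sketch; the random test data, the dimension `n`, the step `eps`, and the helper name `F_of` are illustrative assumptions, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Illustrative data: a random symmetric positive definite L and a random vector x.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
x = rng.standard_normal(n)

def F_of(M):
    """Evaluate F(M) = M^{-1} x x^T M^{-1}."""
    Minv = np.linalg.inv(M)
    return Minv @ np.outer(x, x) @ Minv

# A random symmetric perturbation direction dL.
B = rng.standard_normal((n, n))
dL = (B + B.T) / 2

Linv = np.linalg.inv(L)
F = F_of(L)

# Closed-form differential: dF = -(L^{-1} dL F + F dL L^{-1}).
dF_formula = -(Linv @ dL @ F + F @ dL @ Linv)

# Central finite difference of F along the direction dL.
eps = 1e-6
dF_numeric = (F_of(L + eps * dL) - F_of(L - eps * dL)) / (2 * eps)

max_err = np.max(np.abs(dF_formula - dF_numeric))
print(max_err)  # should be tiny compared with the entries of dF_formula
```

The two results should agree up to finite-difference and rounding error.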
Vectorize this expression and rearrange it into a matrix-valued gradient
$$\eqalign{
d\ell &= \vc{dL} \\
df &= \vc{dF} \;=\; -\LR{F\k\L + \L\k F} d\ell \\
\grad{f}{\ell} &= -\LR{F\k\L + \L\k F} \\\\
}$$
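The vectorized form relies on the identity $\operatorname{vec}(AXB) = (B^T\otimes A)\operatorname{vec}(X)$ for column-stacking $\operatorname{vec}$. A small NumPy sketch of the check follows; the random inputs and the helper name `vec` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Illustrative data: random SPD matrix L and random vector x.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
x = rng.standard_normal(n)

Linv = np.linalg.inv(L)
F = Linv @ np.outer(x, x) @ Linv

def vec(M):
    """Column-stacking vectorization (Fortran order)."""
    return M.flatten(order="F")

# Matrix-valued gradient: df/dl = -(F kron L^{-1} + L^{-1} kron F).
G = -(np.kron(F, Linv) + np.kron(Linv, F))

# Compare G @ vec(dL) with vec(dF) computed from the matrix differential.
B = rng.standard_normal((n, n))
dL = (B + B.T) / 2
dF = -(Linv @ dL @ F + F @ dL @ Linv)

kron_err = np.max(np.abs(G @ vec(dL) - vec(dF)))
print(kron_err)  # expected to be at rounding-error level
```

Because $F$ and $L^{-1}$ are both symmetric, the sum of the two Kronecker products is the same under row-major flattening, so the result here does not depend on the choice of `vec` ordering.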
Another approach is to introduce the fourth-order identity tensor $\E$ with
components
$$\eqalign{
\E_{ijkl} = \grad{L_{ij}}{L_{kl}} \;=\; \d_{ik}\,\d_{jl}
}$$
and calculate a tensor-valued gradient
$$\eqalign{
dF &= -\LR{\L\E F + F\E\L}:dL \\
\grad{F}{L} &= -\LR{\L\E F + F\E\L} \\\\
}$$
Yet another approach is to write the differential using index notation.
This yields the scalar components of the gradient
$$\eqalign{
\grad{F_{ij}}{L_{kl}} &= -\LR{\L_{ik}F_{jl} + F_{ik}\L_{jl}} \\
}$$
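The component formula can be turned directly into a four-index array and contracted with $dL$ to recover the matrix differential. The sketch below uses `np.einsum`; the test data and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3

# Illustrative data: random SPD matrix L and random vector x.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
x = rng.standard_normal(n)

Linv = np.linalg.inv(L)
F = Linv @ np.outer(x, x) @ Linv

# Components: dF_ij/dL_kl = -(Linv_ik F_jl + F_ik Linv_jl).
G = -(np.einsum("ik,jl->ijkl", Linv, F) + np.einsum("ik,jl->ijkl", F, Linv))

# Contract over (k, l) with a perturbation dL and compare with the
# matrix form dF = -(L^{-1} dL F + F dL L^{-1}).
B = rng.standard_normal((n, n))
dL = (B + B.T) / 2
dF_tensor = np.einsum("ijkl,kl->ij", G, dL)
dF_matrix = -(Linv @ dL @ F + F @ dL @ Linv)

tensor_err = np.max(np.abs(dF_tensor - dF_matrix))
print(tensor_err)  # expected to be at rounding-error level
```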
First, note that the derivative of $g(L)=L^{-1}$ is the linear function $Δ↦ -L^{-1}ΔL^{-1}$, which follows from
$$\begin{aligned} g(L+Δ) &= (L+Δ)^{-1} \\&= \big(L(I+L^{-1}Δ)\big)^{-1} \\&= (I+L^{-1}Δ)^{-1}L^{-1} \\&= \big(I - L^{-1}Δ + \mathcal{O}(‖Δ‖^2)\big) L^{-1} \qquad\text{(Neumann series)} \\&= g(L) - L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2) \end{aligned}$$
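The second-order size of the remainder can be observed numerically: shrinking the perturbation by a factor of 10 should shrink the error of the first-order approximation by roughly a factor of 100. This is a rough sketch with assumed test data; the step sizes `1e-2` and `1e-3` and the helper name `remainder` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# Illustrative data: random SPD matrix L and a unit-norm direction D.
A = rng.standard_normal((n, n))
L = A @ A.T + n * np.eye(n)
Linv = np.linalg.inv(L)

D = rng.standard_normal((n, n))
D /= np.linalg.norm(D)  # normalize to Frobenius norm 1

def remainder(t):
    """Frobenius norm of (L + tD)^{-1} - (L^{-1} - t L^{-1} D L^{-1})."""
    exact = np.linalg.inv(L + t * D)
    first_order = Linv - t * Linv @ D @ Linv
    return np.linalg.norm(exact - first_order)

r_coarse = remainder(1e-2)
r_fine = remainder(1e-3)
ratio = r_coarse / r_fine
print(ratio)  # a value near 100 indicates a quadratic remainder
```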
Now consider the function $f(L) = L^{-1} xx^⊤ L^{-1}$. Then
$$\begin{aligned} f(L+Δ) &= (L+Δ)^{-1}xx^⊤(L+Δ)^{-1} \\&= \big(L^{-1} - L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2)\big)xx^⊤\big(L^{-1} - L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2)\big) \\&= f(L) - L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1} + \mathcal{O}(‖Δ‖^2) \end{aligned}$$
Hence, the derivative of $f$ at $L$ is the linear function $$\boxed{Δ ⟼ - L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1}}$$
This linear function can be represented as a fourth-order tensor, using the formula $AXB^⊤ = (A⊗B)⋅X$, $^{(*)}$ where $(A⊗B)_{ij, kl} ≔ A_{ik}B_{jl}$ is a fourth-order tensor and $(A⊗B)⋅X ≔ ∑_{kl}(A⊗B)_{ij, kl}X_{kl}$ is a contraction over the two indices $k, l$. Using the symmetry of $L$ we have:
$$\begin{aligned} \mathrm{D}f(L) &= [Δ⟼- L^{-1}ΔL^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}ΔL^{-1}] &&∈ \text{Lin}(ℝ^{n×n},ℝ^{n×n}) \\&≅ - L^{-1}⊗L^{-1}xx^⊤L^{-1} - L^{-1}xx^⊤L^{-1}⊗L^{-1} &&∈ ℝ^{n×n}⊗(ℝ^{n×n})^* \end{aligned}$$
(*): Note that one of the linked posts has $\operatorname{vec}(AXB) = (B^⊤⊗A)\operatorname{vec}(X)$; this is just the result of a different convention for how $⊗$ is defined.
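The two conventions can be compared numerically. In the sketch below, which uses arbitrary random matrices as assumed test data, the four-index tensor $(A⊗B)_{ij,kl} = A_{ik}B_{jl}$ reproduces $AXB^⊤$ under contraction, and flattening its index pairs in row-major order gives `np.kron(A, B)`. With column-stacking vectorization, the same product structure instead reads $\operatorname{vec}(AXB) = (B^⊤⊗A)\operatorname{vec}(X)$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3

# Arbitrary test matrices (illustrative only).
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))

# Four-index tensor T[i, j, k, l] = A[i, k] * B[j, l].
T = np.einsum("ik,jl->ijkl", A, B)

# Contraction over (k, l) reproduces A X B^T.
err_tensor = np.max(np.abs(np.einsum("ijkl,kl->ij", T, X) - A @ X @ B.T))

# Row-major flattening of the index pairs (i, j) and (k, l) gives kron(A, B).
err_kron = np.max(np.abs(T.reshape(n * n, n * n) - np.kron(A, B)))

# Column-stacking convention: vec(A X B) = (B^T kron A) vec(X).
lhs_col = (A @ X @ B).flatten(order="F")
rhs_col = np.kron(B.T, A) @ X.flatten(order="F")
err_col = np.max(np.abs(lhs_col - rhs_col))

print(err_tensor, err_kron, err_col)  # all expected at rounding-error level
```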