Derivative of block matrix using Einstein notation

91 Views Asked by Bumbble Comm At 27 Mar 2026 - 5:15

Let $Y = [A \quad XB \quad C]$, where $A,B,C,X$ are all matrices of appropriate sizes. What is the derivative of $Y$ w.r.t. $X$?

The part that confuses me is that $\frac{\partial A}{\partial X}$ should be a zero rank-4 tensor. At the same time, $\frac{\partial (X^{ij}B^{jk})}{\partial X^{lm}} = \frac{\partial X^{ij}}{\partial X^{lm}}B^{jk} = \delta_{il} \delta_{jm} B^{jk}$ should be a rank-3 tensor. It seems that the sizes of these tensors do not match.

Thank you in advance.


Bumbble Comm On 28 Jul 2023 - 10:43

$ \def\bbR#1{{\mathbb R}^{#1}} \def\e{{\large\epsilon}} \def\ve{{\large\varepsilon}} \def\Eij{E_{ij}} \def\Xij{X_{ij}} \def\Ykl{Y_{k\ell}} \def\Bjl{B_{jp}} \def\Dki{\delta_{ki}} \def\smA{{\small A}} \def\smC{{\small C}} \def\smF{{\small F}} \def\smG{{\small G}} \def\smH{{\small H}} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\mc#1{\left[\begin{array}{c|c|c}#1\end{array}\right]} $
Given the matrix
$$\eqalign{ Y &= \mc{ A & XB & C \\ } \\ }$$
the gradient of $Y$ with respect to a single scalar element of $X$ is
$$\eqalign{ \grad Y\Xij &= \mc{ 0_\smA & \Eij\,B & 0_\smC \\ } \\ }$$
where $\Eij$ is a matrix whose components are all $0$ except for the $(i,j)$ element, which equals $1$. It is the component-wise self-gradient of $X$
$$\eqalign{ \Eij = \grad X\Xij \\ }$$

The notation $0_\smA$ denotes a zero matrix of the same size as the matrix $A,\,$ i.e.
$$\eqalign{ &0_\smA = A-A \\ &0_\smC = C-C \\ }$$

Assume that $Y\in\bbR{m\times n}$ and that the Cartesian basis vectors for the two dimensions are $\e_k\in\bbR{m}$ and $\ve_\ell\in\bbR{n}$. Then you can calculate the component-wise gradients as
$$\eqalign{ \grad{\Ykl}{\Xij} &= \e_k^T \mc{ 0_\smA & \Eij\,B & 0_\smC \\ }\,\ve_\ell \\ }$$

Most of these scalar gradients evaluate to zero. For the restricted index range $\LR{\,\ell_\smA < \ell < \ell_\smC}$, reading $\ell_\smA$ as the last column index of the $A$ block and $\ell_\smC$ as the first column index of the $C$ block, the non-zero gradients are given by
$$\eqalign{ \grad{\Ykl}{\Xij} &= \Dki\,\Bjl\qquad \LR{\,p\,=\,\ell-\ell_\smA} \\ }$$
Note that the presence of the Kronecker delta symbol means that most of the terms in this formula also evaluate to zero.
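The component formula can be checked numerically. The following is a minimal sketch, assuming NumPy and small random matrices; the function names `numeric_derivative` and `closed_form_derivative` are illustrative and not part of the answer above. It stores the derivative as a 4-index array `D[k, l, i, j]` holding $\partial Y_{k\ell}/\partial X_{ij}$, so the zero blocks for $A$ and $C$ and the $\delta_{ki}B_{jp}$ block all live in one tensor of the same rank. Indices are 0-based here, so the column offset of the $XB$ block is simply the number of columns of $A$.

```python
import numpy as np

def numeric_derivative(A, X, B, C, eps=1e-6):
    """Central-difference estimate of D[k, l, i, j] = dY[k, l] / dX[i, j]
    for Y = [A | X @ B | C]."""
    def Y(Xv):
        return np.hstack([A, Xv @ B, C])

    m, n = Y(X).shape
    r, s = X.shape
    D = np.zeros((m, n, r, s))
    for i in range(r):
        for j in range(s):
            E = np.zeros_like(X)      # E_ij: single 1 at position (i, j)
            E[i, j] = 1.0
            D[:, :, i, j] = (Y(X + eps * E) - Y(X - eps * E)) / (2 * eps)
    return D

def closed_form_derivative(A, X, B, C):
    """Closed form: delta_{ki} * B[j, p] in the X @ B columns, zero elsewhere."""
    m, s = X.shape
    nA, nB, nC = A.shape[1], B.shape[1], C.shape[1]
    D = np.zeros((m, nA + nB + nC, m, s))
    # Column l of Y maps to column p = l - nA of B (0-based indexing).
    D[:, nA:nA + nB, :, :] = np.einsum("ki,jp->kpij", np.eye(m), B)
    return D

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((3, 1))

D_num = numeric_derivative(A, X, B, C)
D_cf = closed_form_derivative(A, X, B, C)
print(D_cf.shape)
print(np.allclose(D_num, D_cf, atol=1e-6))
```

Because $Y$ is linear in $X$, the central difference agrees with the closed form up to floating-point rounding, which is why a loose tolerance is enough for the comparison.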
