Is there a generalization of the chain rule of differentiation to tensors? For example, how can I differentiate $f(g(X))$, where $g: Matrix(d_1, d_2) \to Matrix(d_3, d_4)$ and $f: Matrix(d_3, d_4) \to Matrix(d_5, d_6)$?
P.S. I understand how to compute the derivatives themselves; I want a rule that takes the tensor derivative of $f$ and the tensor derivative of $g$ and combines them in one short step.
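Use Leibniz's rule (the product rule) in index notation: $$(M_{ij}N_{jk})_{,r}=M_{ij,r}N_{jk}+M_{ij}N_{jk,r}$$ where ${}_{,r}$ denotes differentiation with respect to the $r$-th variable and the repeated index $j$ is summed over. The product $M_{ij}N_{jk}$ is matrix multiplication, which is how the composition of linear maps is computed.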
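For the composition asked about in the question, the chain rule can be written in the same index style. Writing $Y = g(X)$ (the symbol $Y$ and the index names are chosen here only for illustration), $$\frac{\partial (f\circ g)_{ab}}{\partial X_{ij}}=\sum_{k=1}^{d_3}\sum_{l=1}^{d_4}\left.\frac{\partial f_{ab}}{\partial Y_{kl}}\right|_{Y=g(X)}\frac{\partial g_{kl}}{\partial X_{ij}}$$ so the two four-index derivative tensors are combined by contracting the two output indices of $g$ against the two input indices of $f$. A minimal NumPy sketch that checks this numerically, assuming the simplest case where both maps are linear, $g(X)=AXB$ and $f(Y)=CYD$ (the matrices $A, B, C, D$ and the dimensions are hypothetical, made up for the check):

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, d3, d4, d5, d6 = 2, 3, 4, 5, 3, 2  # arbitrary illustrative sizes

# Hypothetical linear maps: g(X) = A X B and f(Y) = C Y D.
A = rng.standard_normal((d3, d1))
B = rng.standard_normal((d2, d4))
C = rng.standard_normal((d5, d3))
D = rng.standard_normal((d4, d6))

# Derivative tensors, stored with output indices first.
# Jg[k, l, i, j] = d g_kl / d X_ij = A[k, i] * B[j, l]
Jg = np.einsum("ki,jl->klij", A, B)
# Jf[a, b, k, l] = d f_ab / d Y_kl = C[a, k] * D[l, b]
Jf = np.einsum("ak,lb->abkl", C, D)

# Chain rule: sum over (k, l), the outputs of g and the inputs of f.
J_chain = np.einsum("abkl,klij->abij", Jf, Jg)

# Direct derivative of the composite f(g(X)) = (C A) X (B D).
J_direct = np.einsum("ai,jb->abij", C @ A, B @ D)

chain_rule_holds = np.allclose(J_chain, J_direct)
# The same contraction expressed with tensordot over two axes.
same_as_tensordot = np.allclose(J_chain, np.tensordot(Jf, Jg, axes=2))
print(chain_rule_holds, same_as_tensordot)
```

Because both maps are linear here, their derivative tensors are constant. For a nonlinear $f$, the tensor `Jf` would have to be evaluated at $Y=g(X)$ before the contraction.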