Chain rule derivatives matrix calculus

195 Views Asked by Bumbble Comm At 10 May 2026 - 3:58

I am going through a proof which includes the following two statements about the first and second derivatives of a function f:

$\nabla f(Sy) = S^T \nabla f(Sy) \qquad \nabla^2f(Sy)=S^T \nabla^2 f(Sy)S$

Here $f$ is a twice continuously differentiable function defined on $\mathbb{R}^n$ and $S: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is an invertible map.

It is not entirely clear to me which intermediate steps are needed to arrive at these results. Using the chain rule and properties of the inner product, I would write the following:

$\begin{align} \nabla f(Sy) &= \nabla f(Sy) \cdot S \\ &= S \cdot \nabla f(Sy) \\ &= S^T \nabla f(Sy) \end{align}$

$\begin{align} \nabla^2f(Sy) &= S^T (\nabla^2 f(Sy)\cdot S) \\ &= S^T (\nabla^2 f(Sy)^T S) \\ &= S^T \nabla^2 f(Sy)S \end{align}$

Maybe someone can confirm that this is in fact correct, or help me find the correct steps.


There is 1 best solution below

Bumbble Comm On 25 Feb 2018 - 3:52

Suppose you know the gradient $(g)$ and Hessian $(H)$ of a function in terms of the variable $x$:
$$\eqalign{ f = f(x),\qquad g = \frac{\partial f}{\partial x},\qquad H = \frac{\partial g}{\partial x} }$$
You are then told that $x$ is not independent, but actually depends on another variable $(x = Sy)$. Note that the matrix $S$ does not need to be invertible; it might even be rectangular.

Let's find the gradient $(p)$ and Hessian $(Q)$ with respect to this new variable by way of differentials. Here the colon denotes the inner product, $a:b = a^Tb$, the matrix $S$ is constant, and $g$ and $H$ are evaluated at $x = Sy$. $$\eqalign{ df &= g:dx = g:(S\,dy) = (S^Tg):dy \cr p &= \frac{\partial f}{\partial y} = S^Tg \cr \cr dp &= S^T\,dg = S^T(H\,dx) = S^TH(S\,dy) \cr Q &=\frac{\partial p}{\partial y} = S^THS }$$
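If it helps to see the two formulas confirmed numerically, here is a minimal sketch using NumPy and central finite differences. The test function $f(x) = \sum_i \sin x_i + \tfrac12 x^TAx$, the random matrices, and the helper name `central_diff` are all assumptions made up for this illustration; they are not part of the question. $S$ is deliberately rectangular ($4 \times 3$) to show that invertibility is not needed.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                          # x in R^n, y in R^m; S is rectangular on purpose
S = rng.standard_normal((n, m))
B = rng.standard_normal((n, n))
A = B + B.T                          # symmetric matrix for the quadratic term

def f(x):                            # illustrative test function (an assumption)
    return np.sum(np.sin(x)) + 0.5 * x @ A @ x

def grad_f(x):                       # g = df/dx
    return np.cos(x) + A @ x

def hess_f(x):                       # H = dg/dx
    return np.diag(-np.sin(x)) + A

def h(y):                            # composite function h(y) = f(Sy)
    return f(S @ y)

def p(y):                            # claimed gradient w.r.t. y: S^T g, with g evaluated at x = Sy
    return S.T @ grad_f(S @ y)

def Q(y):                            # claimed Hessian w.r.t. y: S^T H S, with H evaluated at x = Sy
    return S.T @ hess_f(S @ y) @ S

def central_diff(func, y, eps=1e-6):
    """Approximate the derivative of func at y, one column per coordinate of y."""
    cols = []
    for i in range(y.size):
        e = np.zeros_like(y)
        e[i] = eps
        cols.append((func(y + e) - func(y - e)) / (2 * eps))
    return np.stack(cols, axis=-1)

y0 = rng.standard_normal(m)
grad_err = np.max(np.abs(central_diff(h, y0) - p(y0)))   # compares dh/dy with S^T g
hess_err = np.max(np.abs(central_diff(p, y0) - Q(y0)))   # compares dp/dy with S^T H S
print("max gradient error:", grad_err)
print("max Hessian error: ", hess_err)
```

Both printed errors should be tiny (limited only by finite-difference accuracy), which is consistent with $p = S^Tg$ and $Q = S^THS$ for this particular test function. It is a sanity check, not a proof.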
