How to solve for differentials to find Jacobian in system of equations?


I was reading the paper Input Convex Neural Networks and couldn't understand part of the derivation of Proposition 3 (Section G of the supplementary material). I have put an image of the section below:

link to image

The authors describe using differentials to solve for the desired gradients. In the first part, for equations (35) to (37), they describe a trick of replacing $dh$ with $I$. However, when computing the Jacobian with respect to $G$ that trick clearly does not work, and the paper does not explain how to extend it to this case. I tried reading up on matrix differentials, but could not follow the derivation. Does anyone know how to get from equation (38) to equation (39)?

BEST ANSWER

The paper cleverly defines a vector $$\eqalign{ c=\begin{bmatrix}c_y\\c_\lambda\\c_t\end{bmatrix} = -M^{-1}\begin{bmatrix}\frac{\partial\ell}{\partial y}\\0\\0\end{bmatrix} }$$ Other than the fact that it's symmetric, the details of the $M$ matrix are not important.
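As a numeric sanity check of the $c$-vector trick, the sketch below builds a random symmetric, well-conditioned stand-in for $M$ (the block sizes and the construction of $M$ are assumptions for illustration, not the paper's actual KKT matrix) and verifies that $-\big[\frac{\partial\ell}{\partial y};0;0\big]^TM^{-1}\big[0;dh;0\big] = c_\lambda^Tdh$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4                      # assumed sizes of the y- and lambda-blocks
N = n + m + 1                    # total size: y, lambda, t blocks

A = rng.standard_normal((N, N))
M = A + A.T + N * np.eye(N)      # symmetric, diagonally shifted to be invertible
g = rng.standard_normal(n)       # stands in for dl/dy

# c = -M^{-1} [g; 0; 0]
c = -np.linalg.solve(M, np.concatenate([g, np.zeros(m + 1)]))
c_y, c_lam, c_t = c[:n], c[n:n + m], c[n + m:]

# Left side: -[g; 0; 0]^T M^{-1} [0; dh; 0]
dh = rng.standard_normal(m)
rhs_vec = np.concatenate([np.zeros(n), dh, np.zeros(1)])
lhs = -np.concatenate([g, np.zeros(m + 1)]) @ np.linalg.solve(M, rhs_vec)

# Right side: c_lambda^T dh
rhs = c_lam @ dh
assert np.isclose(lhs, rhs)
```

Note that the symmetry of $M$ is exactly what lets us pull $M^{-1}$ onto the left factor, i.e. $\big[g;0;0\big]^TM^{-1} = \big(M^{-1}\big[g;0;0\big]\big)^T$.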

The $\,c\,$ vector is used to simplify calculations, like the following for equation $(37)$
$$\eqalign{ \bigg(\frac{\partial\ell}{\partial y}\bigg)^Tdy &= \begin{bmatrix} \Big(\frac{\partial\ell}{\partial y}\Big)^T &0&0\end{bmatrix}\begin{bmatrix}dy\\d\lambda\\dt\end{bmatrix} \cr &= -\begin{bmatrix} \Big(\frac{\partial\ell}{\partial y}\Big)^T &0&0\end{bmatrix}M^{-1}\begin{bmatrix}0\\dh\\0\end{bmatrix} \cr &= c^T\begin{bmatrix}0\\dh\\0\end{bmatrix} = c_\lambda^Tdh = c_\lambda:dh \cr \bigg(\frac{\partial\ell}{\partial y}\bigg)^T\frac{\partial y}{\partial h} &= c_\lambda \cr\cr }$$

The calculation to obtain equation $(39)$ is similar, using terms involving $dG$ instead of $dh$ (the step $c_y^TdG^T\lambda = \lambda^TdG\,c_y$ is just the transpose of a scalar)
$$\eqalign{ \bigg(\frac{\partial\ell}{\partial y}\bigg)^Tdy &= \begin{bmatrix} \Big(\frac{\partial\ell}{\partial y}\Big)^T &0&0\end{bmatrix}\begin{bmatrix}dy\\d\lambda\\dt\end{bmatrix} \cr &= -\begin{bmatrix} \Big(\frac{\partial\ell}{\partial y}\Big)^T &0&0\end{bmatrix}M^{-1}\begin{bmatrix}dG^T\lambda\\dG\,y\\0\end{bmatrix} \cr &= c^T\begin{bmatrix}dG^T\lambda\\dG\,y\\0\end{bmatrix} \cr\cr &= c_y^TdG^T\lambda + c_\lambda^TdG\,y \cr &= \lambda^TdG\,c_y + c_\lambda^TdG\,y \cr &= \Big(\lambda c_y^T + c_\lambda y^T\Big):dG \cr \bigg(\frac{\partial\ell}{\partial y}\bigg)^T\frac{\partial y}{\partial G} &= \lambda c_y^T + c_\lambda y^T \cr\cr }$$

NB: The authors use a different layout convention for gradients, which is why mine are transposed compared to those in the paper.
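The final rearrangement into a Frobenius product, $c_y^TdG^T\lambda + c_\lambda^TdG\,y = \big(\lambda c_y^T + c_\lambda y^T\big):dG$, can be checked numerically with random vectors and matrices (the shapes below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3                      # G is m x n, so the perturbation dG is too
c_y, y = rng.standard_normal(n), rng.standard_normal(n)
c_lam, lam = rng.standard_normal(m), rng.standard_normal(m)
dG = rng.standard_normal((m, n))

# c_y^T dG^T lambda + c_lambda^T dG y
lhs = c_y @ dG.T @ lam + c_lam @ dG @ y

# (lambda c_y^T + c_lambda y^T) : dG, where A : B = sum_ij A_ij B_ij
grad = np.outer(lam, c_y) + np.outer(c_lam, y)
rhs = np.sum(grad * dG)
assert np.isclose(lhs, rhs)
```

Since this holds for every perturbation $dG$, the matrix `grad` is the gradient $\lambda c_y^T + c_\lambda y^T$ of equation $(39)$.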

Also, I've used subscripts for the components of the $c$ vector; using superscripts is just ugly.