Difference between gradient and Jacobian


Could anyone explain in simple words (and maybe with an example) what the difference between the gradient and the Jacobian is?

The gradient is a vector with the partial derivatives, right?

There are 3 answers below.

Answer 1

These are two particular matrix representations of the derivative of a differentiable function $f,$ used in two cases:

  • when $f:\mathbb{R}^n\to\mathbb{R},$ then for $x$ in $\mathbb{R}^n$, $$\mathrm{grad}_x(f):=\left[\frac{\partial f}{\partial x_1}\ \frac{\partial f}{\partial x_2}\ \dots\ \frac{\partial f}{\partial x_n}\right]\!\bigg\rvert_x$$ is the $1\times n$ matrix of the linear map $Df(x)$ expressed with respect to the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}$. Because this matrix has only one row, you can think of it as the vector $$\nabla f(x):=\left(\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},\dots,\frac{\partial f}{\partial x_n}\right)\!\bigg\rvert_x\in\mathbb{R}^n.$$ This vector $\nabla f(x)$ is the unique vector of $\mathbb{R}^n$ such that $Df(x)(y)=\langle\nabla f(x),y\rangle$ for all $y\in\mathbb{R}^n$ (see the Riesz representation theorem), where $\langle\cdot,\cdot\rangle$ is the usual scalar product $$\langle(x_1,\dots,x_n),(y_1,\dots,y_n)\rangle=x_1y_1+\dots+x_ny_n.$$
  • when $f:\mathbb{R}^n\to\mathbb{R}^m,$ then for $x$ in $\mathbb{R}^n$, $$\mathrm{Jac}_x(f)=\left.\begin{bmatrix}\frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}&\dots&\frac{\partial f_1}{\partial x_n}\\\frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}&\dots&\frac{\partial f_2}{\partial x_n}\\ \vdots&\vdots&&\vdots\\\frac{\partial f_m}{\partial x_1}&\frac{\partial f_m}{\partial x_2}&\dots&\frac{\partial f_m}{\partial x_n}\\\end{bmatrix}\right|_x$$ is the $m\times n$ matrix of the linear map $Df(x)$ expressed with respect to the canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m.$

For example, with $f:\mathbb{R}^2\to\mathbb{R}$ such that $f(x,y)=x^2+y$ you get $\mathrm{grad}_{(x,y)}(f)=[2x \,\,\,1]$ (or $\nabla f(x,y)=(2x,1)$), and for $f:\mathbb{R}^2\to\mathbb{R}^2$ such that $f(x,y)=(x^2+y,y^3)$ you get $\mathrm{Jac}_{(x,y)}(f)=\begin{bmatrix}2x&1\\0&3y^2\end{bmatrix}.$
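Both matrices in this example can be checked numerically. The sketch below (assuming numpy is available; `numerical_jacobian` is a hypothetical helper, not part of the answer) approximates the derivative by central finite differences; for a scalar function it returns the $1\times n$ gradient row, and for a vector function the full $m\times n$ Jacobian:

```python
import numpy as np

def numerical_jacobian(f, x, h=1e-6):
    """Approximate the m x n Jacobian of f at x by central differences."""
    x = np.asarray(x, dtype=float)
    fx = np.atleast_1d(f(x))
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (np.atleast_1d(f(x + e)) - np.atleast_1d(f(x - e))) / (2 * h)
    return J

# Scalar example: f(x, y) = x^2 + y  ->  gradient row [2x, 1]
f_scalar = lambda v: v[0]**2 + v[1]
print(numerical_jacobian(f_scalar, [3.0, 2.0]))   # ~ [[6. 1.]]

# Vector example: f(x, y) = (x^2 + y, y^3)  ->  Jacobian [[2x, 1], [0, 3y^2]]
f_vector = lambda v: np.array([v[0]**2 + v[1], v[1]**3])
print(numerical_jacobian(f_vector, [3.0, 2.0]))   # ~ [[6. 1.], [0. 12.]]
```

At $(x,y)=(3,2)$ the numerical results match the analytic matrices $[6\,\,\,1]$ and $\begin{bmatrix}6&1\\0&12\end{bmatrix}$ to within the finite-difference error.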

Answer 2

The gradient vector of a scalar function $f(\mathbf{x})$ that maps $\mathbb{R}^n\to\mathbb{R}$, where $\mathbf{x}=(x_1,x_2,\ldots,x_n)$, is written as $$\nabla f(\mathbf{x})=\frac{\partial f(\mathbf{x})}{\partial x_1}\hat{x}_1+\frac{\partial f(\mathbf{x})}{\partial x_2}\hat{x}_2+\ldots+\frac{\partial f(\mathbf{x})}{\partial x_n}\hat{x}_n$$

Whereas the Jacobian is taken of a vector function $\mathbf{f}(\mathbf{x})$ that maps $\mathbb{R}^n\to\mathbb{R}^m$, where $\mathbf{f}=(f_1,f_2,\ldots,f_m)$ and $\mathbf{x}=(x_1,x_2,\ldots,x_n)$. The Jacobian is written as

$$J_\mathbf{f} = \frac{\partial (f_1,\ldots,f_m)}{\partial(x_1,\ldots,x_n)} = \left[ \begin{matrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{matrix} \right]$$

Note that when $m=1$ the Jacobian is the same as the gradient (written as a row vector): the Jacobian is a generalization of the gradient.
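This reduction can be checked symbolically. The sketch below (assuming sympy is available; not part of the original answer) builds the $1\times 2$ Jacobian of the scalar $f(x,y)=x^2+y$ from the first answer's example and compares it with the gradient laid out as a row:

```python
import sympy as sp

# For m = 1 the Jacobian is a 1 x n matrix whose single row is the gradient.
x, y = sp.symbols("x y")
f = sp.Matrix([x**2 + y])              # scalar function as a length-1 vector
J = f.jacobian([x, y])                 # 1 x 2 row: [2*x, 1]
grad = sp.Matrix([f[0].diff(x), f[0].diff(y)]).T   # gradient as a row
print(J, grad, J == grad)
```

`Matrix.jacobian` differentiates each component with respect to each variable, so for one component it produces exactly the gradient row.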

The Jacobian determinant can be used for changes of variables because it can be viewed as the ratio of an infinitesimal change in the variables of one coordinate system to another. This requires that the function $\mathbf{f}(\mathbf{x})$ maps $\mathbb{R}^n\to\mathbb{R}^n$, which produces an $n\times n$ square matrix for the Jacobian. For example

$$\iiint_R f(x,y,z) \,dx\,dy\,dz = \iiint_S f(x(u,v,w),y(u,v,w),z(u,v,w))\left|\frac{\partial (x,y,z)}{\partial(u,v,w)}\right|\,du\,dv\,dw$$

where the Jacobian $J_\mathbf{g}$ is taken of the function

$$\mathbf{g}(u,v,w)=x(u,v,w)\hat{\imath}+y(u,v,w)\hat{\jmath}+z(u,v,w)\hat{k}$$

and the regions $R$ and $S$ correspond to each other.
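As a concrete instance of the substitution formula, one can take $(u,v,w)$ to be spherical coordinates $(r,\theta,\phi)$, for which $\left|\partial(x,y,z)/\partial(r,\theta,\phi)\right|=r^2\sin\phi$. The sketch below (assuming numpy; a midpoint-rule sum, not part of the original answer) integrates $f\equiv 1$ over the unit ball in these coordinates and recovers the ball's volume $4\pi/3$:

```python
import numpy as np

# Change of variables with spherical coordinates (r, theta, phi):
# |det d(x,y,z)/d(r,theta,phi)| = r^2 sin(phi).
# Integrating f(x,y,z) = 1 over the unit ball should give 4*pi/3.
n = 100
dr, dt, dp = 1.0 / n, 2 * np.pi / n, np.pi / n
r = (np.arange(n) + 0.5) * dr          # midpoints in [0, 1]
theta = (np.arange(n) + 0.5) * dt      # midpoints in [0, 2*pi]
phi = (np.arange(n) + 0.5) * dp        # midpoints in [0, pi]
R, T, P = np.meshgrid(r, theta, phi, indexing="ij")
jac_det = R**2 * np.sin(P)             # the Jacobian determinant
volume = np.sum(jac_det) * dr * dt * dp
print(volume, 4 * np.pi / 3)           # the two values agree closely
```

Without the $r^2\sin\phi$ factor the sum would measure the flat $(r,\theta,\phi)$ box rather than the ball, which is exactly the role the determinant plays in the formula above.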

Answer 3

The gradient in a general coordinate system depends on the metric tensor, while the Jacobian matrix consists only of the partial derivatives.

The gradient of a vector field is given by:

$\nabla\mathbf{f}=g^{jk}\frac{\partial f^{i}}{\partial x^{j}}\mathbf{e}_{i}\otimes\mathbf{e}_{k}$,

where the Einstein summation convention is implied and the $g^{jk}$ are the components of the inverse metric tensor, evaluated from the Jacobian matrix of the coordinate transformation from the Cartesian coordinate system. In a Cartesian coordinate system, this is exactly the transpose of the Jacobian matrix, which, regardless of the metric tensor, is given by

$\left\{ J\mathbf{f}\right\} _{i,j}=\frac{\partial f^{i}}{\partial x^{j}}$.

For example, for the spherical coordinate system with coordinates $\mathbf z$, we have

$x_0=z_0\sin z_1\sin z_2,\ x_1=z_0 \cos z_1 \sin z_2,\ x_2=z_0\cos z_2$, where $\mathbf x$ denotes the Cartesian coordinates.

Given a vector function $\mathbf{f}$, each column of the gradient is given by $\left\{ \nabla\mathbf{f}\right\} _{:,i}=\frac{\partial f^{i}}{\partial\mathbf{z}_{0}}\hat{\mathbf{z}}_{0}+\frac{1}{\mathbf{z}_{0}\sin\mathbf{z}_{2}}\frac{\partial f^{i}}{\partial\mathbf{z}_{1}}\hat{\mathbf{z}}_{1}+\frac{1}{\mathbf{z}_{0}}\frac{\partial f^{i}}{\partial\mathbf{z}_{2}}\hat{\mathbf{z}}_{2}$

or

$\begin{bmatrix}\frac{\partial f^{i}}{\partial\mathbf{z}_{0}} & \frac{1}{\mathbf{z}_{0}\sin\mathbf{z}_{2}}\frac{\partial f^{i}}{\partial\mathbf{z}_{1}} & \frac{1}{\mathbf{z}_{0}}\frac{\partial f^{i}}{\partial\mathbf{z}_{2}}\end{bmatrix}^\mathrm{T}$

(the scale factors $z_0\sin z_2$ and $z_0$ are the norms of $\partial\mathbf{x}/\partial z_1$ and $\partial\mathbf{x}/\partial z_2$ for the transformation above),

but each row of the Jacobian is still given by

${\left\{ J\mathbf{f}\right\} _{i,:}}=\begin{bmatrix}\frac{\partial f^{i}}{\partial\mathbf{z}_{0}} & \frac{\partial f^{i}}{\partial\mathbf{z}_{1}} & \frac{\partial f^{i}}{\partial\mathbf{z}_{2}}\end{bmatrix}$

This is because the Jacobian matrix is used to solve integrals by substitution, where the determinant of the Jacobian matrix is needed. It is also used to transform partial derivatives into partial derivatives of another coordinate system. Another application is evaluating the metric tensor, as mentioned before.
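That last application can be illustrated numerically. Since the ambient coordinates $\mathbf x$ are Cartesian, the metric tensor of the $\mathbf z$-coordinates is $g=J^{\mathrm{T}}J$, where $J$ is the Jacobian of the transformation $\mathbf z\mapsto\mathbf x$; its diagonal holds the squared scale factors $1$, $z_0^2\sin^2 z_2$, $z_0^2$. A minimal sketch assuming numpy, with hypothetical helpers `transform` and `jacobian`:

```python
import numpy as np

# Spherical transformation from the answer: z = (z0, z1, z2) -> Cartesian x.
def transform(z):
    z0, z1, z2 = z
    return np.array([z0 * np.sin(z1) * np.sin(z2),
                     z0 * np.cos(z1) * np.sin(z2),
                     z0 * np.cos(z2)])

def jacobian(f, z, h=1e-6):
    """Central-difference Jacobian; column j holds the partials w.r.t. z_j."""
    z = np.asarray(z, dtype=float)
    cols = []
    for j in range(z.size):
        e = np.zeros_like(z)
        e[j] = h
        cols.append((f(z + e) - f(z - e)) / (2 * h))
    return np.column_stack(cols)

z = np.array([2.0, 0.7, 1.1])          # an arbitrary point (z0, z1, z2)
J = jacobian(transform, z)
g = J.T @ J                            # metric tensor of the z-coordinates
# Diagonal ~ (1, z0^2 sin^2 z2, z0^2); off-diagonal entries vanish,
# i.e. these spherical coordinates are orthogonal.
print(np.round(g, 6))
```

The vanishing off-diagonal entries are what allow the gradient columns above to be written as a sum over unit vectors with simple $1/h_i$ scale factors.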
