How are both of these true: $J = \nabla f ^T $, and also $\nabla f = J^T f$?


391 Views Asked by Bumbble Comm At 10 May 2026 - 7:03

From questions such as Gradient and Jacobian row and column conventions, I understand that when $f$ maps $\mathbb{R}^n$ into $\mathbb{R}$, i.e. $f: \mathbb{R}^n \rightarrow \mathbb{R}$, the Jacobian is the transpose of the gradient: $J = \nabla f^T$.

However, I am still occasionally confused by this, because when looking for an expression for the gradient when $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$, I see expressions such as $\nabla f = J^T f$. An example of this is in Nocedal and Wright, first edition, page 260.

The question is: how can both of these be true, $J = \nabla f^T$ and also $\nabla f = J^T f$?


There are 3 best solutions below

**Bumbble Comm** · Answer 1 · 2018-06-11 01:11:28


If $A=B^T$, then $B=A^T$. It is simply a consequence of the fact that ${\left(A^T\right)}^T=A$.

**Bumbble Comm** · Answer 2 · 2018-06-11 01:54:52

They can both be true because the $f$'s and the corresponding Jacobians are different. Nonlinear least squares has its own notation and conventions for what the Jacobian is applied to: the residual functions, which are squared, summed, and multiplied by 1/2.

I am looking at the 2nd edition of Nocedal and Wright, whereas you are apparently looking at the 1st edition. Perhaps there is a typo in that edition, using $f$ where there should have been an $r$ (see next paragraph).

In the Nocedal and Wright extract, which pertains to a nonlinear least squares problem, $f$ is one half of the sum of squared residuals, $f = \frac{1}{2}\sum_{i=1}^m r_i^2$, where the $r_i$ are the individual residual functions. The Jacobian $J$, in this nonlinear least squares context, is the matrix of partial derivatives of $r_i$ with respect to the variables $x_j$, i.e., the $i$th row of $J$ is the transpose of the gradient of $r_i$. Then it works out that $\nabla f = J^T r$, where $r$ is the column vector of the $r_i$'s. So this is true under these definitions and conventions, which differ from those used outside nonlinear least squares.
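The step "it works out" is just the chain rule: $\partial f / \partial x_j = \sum_i r_i \, \partial r_i / \partial x_j = (J^T r)_j$. A minimal numerical sketch of that identity follows. The residual functions, the evaluation point `x0`, and all function names are made up for illustration and are not taken from Nocedal and Wright.

```python
import numpy as np

def residuals(x):
    # Hypothetical residuals r: R^2 -> R^3, chosen only for illustration.
    return np.array([
        x[0] ** 2 - x[1],
        np.sin(x[0]) + x[1],
        x[0] * x[1] - 1.0,
    ])

def jacobian(x):
    # J[i, j] = d r_i / d x_j, so row i is the transposed gradient of r_i.
    return np.array([
        [2.0 * x[0], -1.0],
        [np.cos(x[0]), 1.0],
        [x[1], x[0]],
    ])

def objective(x):
    # f(x) = 1/2 * sum of squared residuals (a scalar).
    r = residuals(x)
    return 0.5 * (r @ r)

def numerical_gradient(func, x, h=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(x)
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        grad[j] = (func(x + step) - func(x - step)) / (2.0 * h)
    return grad

x0 = np.array([0.7, -1.3])  # arbitrary test point
grad_formula = jacobian(x0).T @ residuals(x0)     # J^T r
grad_numeric = numerical_gradient(objective, x0)  # direct estimate of grad f

print(np.allclose(grad_formula, grad_numeric, atol=1e-6))
```

Note that `jacobian` here is the Jacobian of the residual vector $r$, not of the scalar objective $f$, which is exactly why the two formulas in the question do not conflict.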

**Bumbble Comm** · Answer 3 · 2021-10-01 08:32:03

First, let's clarify the notation. To be precise, the Jacobian of $f$ is written $J_f$, and the gradient is written $\nabla f^T$.

Then, regarding $f=(f_1,\dots,f_m)=f(x)=(f_1(x), \dots, f_m(x))$ as an $m$-dimensional column vector, $$ J_f = \begin{bmatrix} \nabla f_1^T \\ \vdots \\ \nabla f_m^T \end{bmatrix} $$ where each $\nabla f_i$ is a column vector that consists of the partial derivatives in the conventional way.

On the other hand, to consistently interpret $\nabla f^T$, first regard $f^T$ as a row vector $f^T = [f_1, \dots, f_m]$. Then, $$ \nabla f^T = [\nabla f_1, \dots, \nabla f_m]. $$ What we really have is $J_f^T = \nabla f^T$ and $J_f = (\nabla f^T)^T$.

$\nabla f$ cannot be defined in a consistent way once we regard $f$ as a column vector and $\nabla$ as an operation applied to a scalar function to produce a column vector.
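The two layouts in this answer can be checked on a small example. This is only a sketch: the map $f(x) = (x_1 x_2,\; x_2 + x_3^2)$ from $\mathbb{R}^3$ to $\mathbb{R}^2$, the point `x0`, and the helper names are hypothetical, picked just to make the shapes visible.

```python
import numpy as np

# Hypothetical map f: R^3 -> R^2 with components
#   f_1(x) = x_1 * x_2,   f_2(x) = x_2 + x_3^2
# Each gradient below is stored as a 1-D array standing in for a column vector.

def grad_f1(x):
    return np.array([x[1], x[0], 0.0])

def grad_f2(x):
    return np.array([0.0, 1.0, 2.0 * x[2]])

x0 = np.array([1.0, 2.0, 3.0])  # arbitrary evaluation point

# J_f: the transposed gradients stacked as rows, shape (m, n) = (2, 3).
J = np.vstack([grad_f1(x0), grad_f2(x0)])

# "nabla f^T": the gradients placed side by side as columns, shape (n, m) = (3, 2).
grad_fT = np.column_stack([grad_f1(x0), grad_f2(x0)])

print(J.shape, grad_fT.shape)        # (2, 3) (3, 2)
print(np.array_equal(J.T, grad_fT))  # True
```

With $m = 1$ the same construction gives a $1 \times n$ Jacobian, which is the row vector $\nabla f^T$ from the scalar case in the question.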
