I am referring to the 4th page of this article.
I know that if I have column vectors for $y$ and $x$ s.t. $\vec{y}=W\vec{x}$, then
$$\frac{d\vec{y}}{d\vec{x}}=W.$$
I am also trying to see what happens if I have row vectors, and $\vec{y}=\vec{x}V.$
This time, I think it should be $\frac{d\vec{y}}{d\vec{x}}=V^T$.
But, this article says $\frac{d\vec{y}}{d\vec{x}}=V$ in this case, too.
Which one is right? Is my thought right or am I misunderstanding?
Hope you brought your spelunking gear... you're about to enter a rabbit-hole.
The unfortunate truth is that there is no universal agreement about what notation like $\frac{d\mathbf{y}}{d\mathbf{x}}$ even means in higher dimensions(**), and until you agree on the notation and conventions, there is no way to answer your question.
What everybody does agree about is:
Suppose you have a sufficiently well-behaved function $f:\mathbb{R}^n\to\mathbb{R}^m$, a point $\mathbf{p}\in\mathbb{R}^n$, and a vector $\delta \mathbf{p}\in\mathbb{R}^n$. Then you can define the directional derivative of $f$ at the point $\mathbf{p}$ in the direction $\delta \mathbf{p}$ by $$(D_{\delta\mathbf{p}}f)(\mathbf{p}) = \frac{d}{dt} f(\mathbf{p}+t\delta \mathbf{p})\Bigg\vert_{t = 0}.$$ Intuitively, you ignore everything about $f$ except how it changes when you start moving in the $\delta\mathbf{p}$ direction. You're now left with a 1D calculus problem.(*) What is the "type" of this directional derivative? It's a vector in $\mathbb{R}^m$ (an "infinitesimal change" in the image of $f$).
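If a concrete check helps, here is a minimal NumPy sketch (the function `f`, the point, and the direction are arbitrary examples of mine, not from the article) that approximates $\frac{d}{dt} f(\mathbf{p}+t\delta\mathbf{p})\big\vert_{t=0}$ with a central difference and compares it against the Jacobian applied to $\delta\mathbf{p}$:

```python
import numpy as np

# Example f: R^2 -> R^2 (an arbitrary smooth function, for illustration only)
def f(p):
    x, y = p
    return np.array([x * y, np.sin(x) + y**2])

p = np.array([1.0, 2.0])
dp = np.array([0.5, -1.0])   # direction; need not be a unit vector

# Directional derivative as a 1D derivative: d/dt f(p + t*dp) at t = 0,
# approximated with a central difference.
t = 1e-6
numeric = (f(p + t * dp) - f(p - t * dp)) / (2 * t)

# Analytic answer via the Jacobian of f at p:
# J = [[y, x], [cos(x), 2y]]
J = np.array([[p[1], p[0]],
              [np.cos(p[0]), 2 * p[1]]])

print(np.allclose(numeric, J @ dp, atol=1e-4))  # True
```

Note that the result is a vector in $\mathbb{R}^2$, exactly as the "type" argument above predicts.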
If the function $f$ is differentiable, the directional derivative is linear in the direction parameter: for any $\mathbf{p}$ and any vectors $\delta\mathbf{p}_1, \delta\mathbf{p}_2$ and $\alpha\in\mathbb{R}$, $$(D_{\delta\mathbf{p}_1 + \alpha \delta\mathbf{p}_2}f)(\mathbf{p}) = (D_{\delta\mathbf{p}_1}f)(\mathbf{p}) + \alpha (D_{\delta\mathbf{p}_2}f)(\mathbf{p}).$$ The most familiar situation is probably the case $n=2,m=1$, where this fact implies that locally, every differentiable $z(x,y)$ looks like a plane if you "zoom in far enough."
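You can also watch this linearity happen numerically. The following sketch (again with an arbitrary example function of my choosing) checks $D_{\delta\mathbf{p}_1 + \alpha\delta\mathbf{p}_2}f = D_{\delta\mathbf{p}_1}f + \alpha D_{\delta\mathbf{p}_2}f$ by finite differences:

```python
import numpy as np

# Arbitrary example f: R^2 -> R^2, quadratic so finite differences are very accurate
def f(p):
    x, y = p
    return np.array([x**2 + y, x * y])

def dir_deriv(f, p, dp, t=1e-6):
    # central-difference approximation of (D_dp f)(p)
    return (f(p + t * dp) - f(p - t * dp)) / (2 * t)

p = np.array([1.0, -0.5])
d1 = np.array([0.3, 0.9])
d2 = np.array([-1.2, 0.4])
alpha = 2.5

lhs = dir_deriv(f, p, d1 + alpha * d2)
rhs = dir_deriv(f, p, d1) + alpha * dir_deriv(f, p, d2)
print(np.allclose(lhs, rhs))  # True
```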
Now the rabbit-hole truly begins. Let us write $d_{\mathbf{p}}f$ for the differential: the linear map sending vectors in $\mathbb{R}^n$ (representing infinitesimal changes in $\mathbf{p}$) to vectors in $\mathbb{R}^m$ (representing infinitesimal changes in $f$). Let us focus on the case $m=1$ for now (scalar-valued functions), so that $d_{\mathbf{p}}f$ is a linear map from $n$-dimensional vectors to scalars. We know from linear algebra that any linear map can be written as matrix multiplication: if we arrange $\delta \mathbf{p}$ into a column vector, there must exist some matrix $M$ such that $$(d_{\mathbf{p}}f)(\delta\mathbf{p}) = M\delta\mathbf{p}.$$ In other words, evaluating the differential at $\delta\mathbf{p}$ is the same as multiplying the vector by a matrix. Since the differential (a function) and $M$ (a matrix representation of that function) are so intimately linked, the two are very often conflated, given the same name, and written with the same notation. (If you think this causes large amounts of confusion, you wouldn't be wrong.) Commonly this matrix is called the Jacobian matrix, so let's adopt that convention.
Now what is the dimension of $M$? It needs to multiply a column vector to yield a scalar... therefore it must be a row vector. And indeed, using the definition of the differential and directional derivative you can easily show that $$M = \left[\begin{array}{ccc}\frac{\partial f}{\partial x_1} & \cdots & \frac{\partial f}{\partial x_n}\end{array}\right].$$ This row vector is one (and possibly the most) natural candidate for "the" derivative $\frac{df}{d\mathbf{p}}.$
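Here is a quick sketch of that claim for an arbitrary scalar example of my own (not from the article): the Jacobian is the $1\times n$ row vector of partials, and multiplying it against the column vector $\delta\mathbf{p}$ reproduces the directional derivative.

```python
import numpy as np

# Arbitrary scalar example f: R^3 -> R, for illustration only
def f(p):
    x1, x2, x3 = p
    return x1**2 * x2 + np.exp(x3)

p = np.array([1.0, 2.0, 0.5])

# Jacobian as a 1 x n ROW vector of partial derivatives:
# [df/dx1, df/dx2, df/dx3] = [2*x1*x2, x1**2, exp(x3)]
M = np.array([[2 * p[0] * p[1], p[0]**2, np.exp(p[2])]])

dp = np.array([0.3, -0.2, 1.0])
t = 1e-6
numeric = (f(p + t * dp) - f(p - t * dp)) / (2 * t)

# Row vector times column vector yields a scalar, matching the finite difference:
print(abs((M @ dp)[0] - numeric) < 1e-4)  # True
```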
But wait! Surely the derivative of a scalar function is "supposed to be" a column vector? This idea comes from an extremely unfortunate conflation between the differential of a function and a related but distinct concept called the gradient. Pick an inner product $\langle\cdot,\cdot\rangle$ on $\mathbb{R}^n$ (for example, the Euclidean dot product). Then for every linear function $g:\mathbb{R}^n\to\mathbb{R}$, there exists a unique vector $\mathbf{v}$ with $$g(\mathbf{w}) = \langle \mathbf{v},\mathbf{w}\rangle$$ for all vectors $\mathbf{w}$. In particular, the differential $d_{\mathbf{p}}f$ is such a linear function, and the corresponding vector is called the gradient, usually written $\nabla f(\mathbf{p})$. From the definition you immediately recover the property that taking an inner product with the gradient gives you the directional derivative of the function in that direction: $$\langle \nabla f(\mathbf{p}), \delta\mathbf{p}\rangle = d_{\mathbf{p}}f(\delta \mathbf{p}) = (D_{\delta \mathbf{p}}f)(\mathbf{p}).$$
What is the dimension of $\nabla f$? Well, it must be the same as that of $\delta\mathbf{p}$, and so if we are representing vectors in $\mathbb{R}^n$ by column vectors, the gradient is a column vector. For the Euclidean dot product it is related to the Jacobian by $$\nabla f = M^T,$$ which you can easily check from the definitions.
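A small sketch of this relationship, again using an arbitrary example function of mine: the Jacobian is a row vector, its transpose is the (column-vector) gradient, and dotting the gradient with $\delta\mathbf{p}$ recovers the directional derivative.

```python
import numpy as np

# Arbitrary scalar example f: R^2 -> R, for illustration only
def f(p):
    x, y = p
    return x * np.sin(y)

p = np.array([2.0, 1.0])
dp = np.array([0.7, -0.4])

M = np.array([[np.sin(p[1]), p[0] * np.cos(p[1])]])  # Jacobian: 1 x 2 row vector
grad = M.T                                           # gradient: 2 x 1 column vector

t = 1e-6
numeric = (f(p + t * dp) - f(p - t * dp)) / (2 * t)

# <grad, dp> with the Euclidean dot product equals the directional derivative:
print(abs(float(grad.ravel() @ dp) - numeric) < 1e-4)  # True
```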
The process of converting the differential (a linear function) to the gradient (a vector) and back is called tensor raising/lowering or applying the musical isomorphisms or taking the dual(***). Many sources, including published papers, carelessly conflate the two concepts, and take a transpose here and there "until the dimensions work out." This approach will work out OK often enough for scalar-valued functions in Euclidean space, but becomes a complete disaster when dealing with vector- or matrix-valued functions or with non-Euclidean inner products (as happens all the time in physics).
So, back to your question. We have a function $f(\mathbf{x}) = \mathbf{x}V$, where we have chosen to represent the domain and image of the function with row vectors. We have the differential $$d_{\mathbf{x}}f(\delta \mathbf{x}),$$ which linearly maps a row vector (an infinitesimal change in $\mathbf{x}$) to another row vector (an infinitesimal change in $f$). Since $f$ is itself linear, this differential is easy to calculate: $$d_{\mathbf{x}}f(\delta \mathbf{x}) = \delta \mathbf{x}V.$$ The differential, being linear, can be represented by matrix multiplication (this time on the right), and this matrix is $V$.
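A last numeric sketch (with a random $V$ and random row vectors, purely for illustration): because $f$ is linear, the finite difference $\big(f(\mathbf{x}+t\,\delta\mathbf{x}) - f(\mathbf{x})\big)/t$ equals $\delta\mathbf{x}\,V$ on the nose, showing the Jacobian acting by multiplication on the right.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.standard_normal((3, 4))      # arbitrary matrix, for illustration

def f(x_row):                        # row-vector convention: y = x V
    return x_row @ V

x = rng.standard_normal((1, 3))      # row vector
dx = rng.standard_normal((1, 3))     # infinitesimal change, also a row vector

t = 1e-6
numeric = (f(x + t * dx) - f(x)) / t

# Because f is linear, the differential is exact: d f(dx) = dx V,
# i.e. the Jacobian V acts by multiplication on the RIGHT.
print(np.allclose(numeric, dx @ V))  # True
```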
So the Jacobian matrix, as defined "most naturally" in this setting, is $V$, and the article is correct.
Probably.
(*) NB that neither the notation for the directional derivative, nor the convention on whether $\delta \mathbf{p}$ must be a unit vector, is fully standardized.
(**) Certain purists will howl that the derivative is "not really a ratio." They're both right and wrong, depending on how you interpret "the derivative" and "a ratio." Ignore their attempts to muddy the water for now.
(***) Achtung: there are other types of "dual" that this can easily be confused with.