Correct way to evaluate partial derivatives for a coordinate transform?

Question

Correct way to evaluate partial derivatives for a coordinate transform?

1.9k Views Asked by Bumbble Comm At 27 Mar 2026 - 4:02

This question occurred to me when thinking about differential geometry change of coordinates. Consider the system of equations:

\begin{align*} p(x, y) = x + y &\qquad x(p, q) = p - q \\ q(x, y) = y &\qquad y(p, q) = q \end{align*}

Now, if I wish to evaluate $\frac{dp}{dq}$, there are two evaluation strategies available for me:

\begin{align*} \frac{dp}{dq} = \frac{d(x+y)}{dy} = \frac{dx}{dy} + \frac{dy}{dy} = 0 + 1 = 1 \qquad (1) \end{align*}

One the other hand, consider this evaluation:

\begin{align*} \frac{dp}{dq} = \frac{dp}{dx}\frac{dx}{dq} + \frac{dp}{dy}\frac{dy}{dq} = 1 \cdot -1 + 1 \cdot 1 = 0 \qquad (2) \end{align*}

Indeed, one can prove something much stronger: Let $p_i = f_i(x_1, x_2, \dots x_n)$. Now, the evaluation \begin{align*} \frac{dp_i}{dp_j} = \sum_{k=0}^n \frac{dp_i}{dx_k} \frac{dx_k}{dp_j} = \left[\frac{dp_i}{dx_0} \dots \frac{dp_i}{dx_k} \dots \frac{dp_i}{dx_n} \right] \cdot \left[ \frac{dx_0}{dp_j} \dots \frac{dx_k}{dp_j} \dots \frac{dx_n}{dp_j} \right]^T = (J \cdot J^{-1})_{ij} = \delta_{ij} \end{align*}

where $J$ is the jacobian of the function $\vec p = f(\vec x)$, and $\delta_{ij}$ is the kronecker delta.

I'm puzzled as to which interpretation I should choose. Interpretation (2) is nice if I want to think of the sets of "co-ordinate systems" $\{ p_i \}$ as being linearly independent, just like the original $\{ x_i \}$ are, but I have no idea if this is legal.

I'd love an answer that explains to me when (1) is legal, when (2) is legal, and maybe go into more detail about the relationship with the Jacboian, and related geometric insights!

Original Q&A

There are 3 best solutions below

Bumbble Comm On 31 Aug 2019 - 1:33

(1) is in a somewhat weird position because it is in effect using $\{x,q\}$ as the coordinate system and then treating $p$ as a function on $x$ and $q$. When you write $\frac{d(x+y)}{dy}=1$ (more typically written with a partial derivative $\frac{\partial(x+y)}{\partial y}=1$), you are computing this partial derivative keeping $x$ fixed.

(2) is the more typical way of interpreting $\frac{dp}{dq}$ (or $\frac{\partial p}{\partial q}$), because it is treating $p$ as a function of $p$ and $q$ (strictly, it's treating $p$ as a function of $x$ and $y$, which are themselves treated as functions of $p$ and $q$, but the actual function will not be affected). The partial derivative in $\frac{dx}{dq}$ or $\frac{dy}{dq}$ takes $p$ as fixed, and the derivative of $p$ when we $p$ to be fixed is of course 0.

One notation that I believe is more commonly used in physics may be helpful. We indicate the variable that is held fixed outside a pair of parentheses. In this case, (1) will be written as $$\left(\frac{\partial p}{\partial q}\right)_x$$ while (2) will be written as $$\left(\frac{\partial p}{\partial q}\right)_p$$

Bumbble Comm On 01 Sep 2019 - 1:17

Coordinates transform according to the rule of differentials via the Jacobian matrix, for example in two dimensions:

\begin{align} \epsilon & = \epsilon(x,y), \quad \Delta\epsilon = \frac{\partial\epsilon}{\partial x}\Delta x + \frac{\partial\epsilon}{\partial y} \Delta y \\ \eta & = \eta(x,y), \quad \Delta\eta = \frac{\partial\eta}{\partial x}\Delta x + \frac{\partial\eta}{\partial y} \Delta y \end{align}

$$ J = \begin{bmatrix} \frac{\partial\epsilon}{\partial x} & \frac{\partial\epsilon}{\partial y} \\ \frac{\partial\eta}{\partial x} & \frac{\partial\eta}{\partial y} \end{bmatrix} $$ Let $f\colon \mathbb R^2 \to \mathbb R^2$ denote a coordinate transformation such that $f[(x,y)] = (\epsilon(x,y), \eta(x,y))$. Consider $\epsilon(x,y) = x + y,\, \eta(x,y) = y$. Its Jacobian is:

$$ J = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} $$

Now consider a scalar field $\phi\colon \mathbb R^2 \to \mathbb R$. It transforms under the coordinate transformation as $\phi(x,y) \stackrel{f} \mapsto \phi'(\epsilon,\eta)$. Its gradient $\nabla_{(x,y)}\phi(x,y)$ transforms to $\nabla_{(\epsilon,\eta)}\phi(x,y)$ according to the chain rule:

\begin{align*} \frac{\partial \phi}{\partial \epsilon} & = \frac{\partial \phi}{\partial x}\frac{\partial x}{\partial \epsilon} + \frac{\partial \phi}{\partial y}\frac{\partial y}{\partial \epsilon} \\ \frac{\partial \phi}{\partial \eta} & = \frac{\partial \phi}{\partial x}\frac{\partial x}{\partial \eta} + \frac{\partial \phi}{\partial y}\frac{\partial y}{\partial \eta} \end{align*}

Now let $\phi = \epsilon$ or $\phi = \eta$ and the computations follow along the line of $(2)$ in your question in the appropriate notation. For a geometric interpretation of the limiting behaviour of the change of coordinates with respect to their variables, this would also be equivalent to, for example, $\lim_\limits{\Delta \eta \to 0} \Delta\epsilon/\Delta\eta$ from the coordinate transformation rules shown in the beginning.

Note that you've treated $dp/dq = dp/dy$ in $(1)$, but $x$ and $y$ are independent while $x$ and $q$ are not (recall $x(x,y) = x,\, x(p,q) = p - q$), so $dx/dq \neq dx/dy$ and hence the treatment is incorrect because it doesn't follow the transformation rules.

**Bumbble Comm** · Accepted Answer

$(1)$ and $(2)$ do not make sense as written but I think we can make sense of the third calculation in the following, more general way: for convenience, work in $\mathbb R^2;\ $ the extension to $\mathbb R^n$ is obvious.

Fix $(x,y)\in \mathbb R^2,$ le $r^1,r^2$ be the standard coordinates on $\mathbb R^2$, so that $r^1:\mathbb R^2\to \mathbb R$ is projection on the first coordinate and $r^2:\mathbb R^2\to \mathbb R$ is projection on the second coordinate. (This is why, by the way, $\frac{\partial r^i}{\partial r^j}=\delta^i_j.)$

Let the diffeomorphism $\phi:\mathbb R^2\to \mathbb R^2$ be defined by $(p^1(x,y),p^2(x,y))$ Suppose $f:\mathbb R^2\to \mathbb R$ is smooth. We want to make sense of $\frac{\partial f}{\partial p^i}\Big|_p$ for $i=1,2.$ By itself, it has no intrinsic meaning. But $\frac{\partial (f\circ \phi^{-1})}{\partial r^i}\Big|_{\phi(p)}\ \textit{does}$ make sense: it is just the usual partial derivative of $f\circ \phi^{-1}$ at $\phi(p).$ So we $\textit{define}\ \frac{\partial f}{\partial p^i}\Big|_p$ to be $\frac{\partial (f\circ \phi^{-1})}{\partial r^i}\Big|_{\phi(p)}.$ This may seem silly, but it's only because we are working in a Euclidean space $\mathbb R^2$. If we were considering a more abstract space, this definition would in some sense be forced.

Notice now that $p^i=r^i\circ \phi,$ so taking $f=p^i:\mathbb R^2\to \mathbb R$ we have

$\frac{\partial p^i}{\partial p^j}\Big|_p=\frac{\partial (p^i\circ \phi^{-1})}{\partial r^j}\Big|_{\phi(p)}=\frac{\partial (r^i\circ \phi \circ \phi^{-1})}{\partial r^j}\Big|_{\phi(p)}=\frac{\partial r^i}{\partial r^j}\Big|_{\phi(p)}=\delta^i_j.$

Now, your $\phi(x,y)=(x+y,y)$ is a diffeomorphism, so the above applies to it.

Final remark: it is precisely because of our definition of $\frac{\partial f}{\partial p^i}$ that we can manipulate these symbols "in the usual way.". Of course, $\phi^{-1}$ is also a diffeomorphism because $J(\phi)\neq 0$, so your chain rule calculation amounts to showing that the the Jacobian of the identity map is the identity matrix!

Correct way to evaluate partial derivatives for a coordinate transform?

There are 3 best solutions below

Related Questions in MULTIVARIABLE-CALCULUS

Related Questions in DIFFERENTIAL-GEOMETRY

Related Questions in COORDINATE-SYSTEMS

Trending Questions

Popular # Hahtags

Popular Questions