A puzzling "failure" of the Chain Rule


Consider the standard transformation equations between Cartesian and polar coordinates:

\begin{align*} x&=r \cos \theta\\ y&=r \sin \theta \end{align*}

and the inverse: $r=\sqrt{x^2+y^2}, \theta=\arctan\frac{y}{x}$.

Now consider the following product of derivatives: ${\displaystyle f=\frac{\partial r(x,y)}{\partial y}\frac{\partial y(r,\theta)}{\partial r}}.$ By the chain rule ${\displaystyle f=\frac{\partial r(x,y)}{\partial y}\frac{\partial y(r,\theta)}{\partial r} = \frac{\partial r}{\partial r} = 1}.$ However, if we calculate each multiplicand in isolation, then transform the mixed-coordinate result into a single coordinate system, we get:

\begin{align*} \frac{\partial r(x,y)}{\partial y}& =\frac{y}{\sqrt{x^2+y^2}}=\sin\theta\\ \frac{\partial y(r,\theta)}{\partial r}& = \sin\theta \end{align*}

and therefore, ${\displaystyle f=\frac{\partial r(x,y)}{\partial y}\frac{\partial y(r,\theta)}{\partial r} = \sin^2\theta}$

But we've shown by the chain rule that $f=1$!

:giantfireball:

I must be abusing the chain rule in some way (in the original context in which I stumbled on this, the correct result is $\sin^2\theta$), but I can't see what I did wrong. What's going on?
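The two factors in the question can be checked numerically. The following is a minimal sketch, not part of the original post: the helper names (`partial`, `product_f`) and the sample point are made up for illustration, and derivatives are approximated by central differences. The point it makes is that each partial holds a different variable fixed ($x$ for the first factor, $\theta$ for the second).

```python
import math

def r_of(x, y):
    # r as a function of the Cartesian coordinates
    return math.hypot(x, y)

def y_of(r, theta):
    # y as a function of the polar coordinates
    return r * math.sin(theta)

def partial(f, args, i, h=1e-6):
    """Central difference in argument i, with every other argument held fixed."""
    lo, hi = list(args), list(args)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2 * h)

def product_f(r, theta):
    x, y = r * math.cos(theta), r * math.sin(theta)
    dr_dy = partial(r_of, (x, y), 1)       # x held fixed
    dy_dr = partial(y_of, (r, theta), 0)   # theta held fixed
    return dr_dy * dy_dr

# Arbitrary sample point, chosen only for illustration.
r0, theta0 = 2.0, 0.7
print(product_f(r0, theta0), math.sin(theta0) ** 2)
```

At this sample point the product should match $\sin^2\theta$ to within finite-difference error, and it is not close to $1$.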


1
On BEST ANSWER

Thanks to @InterstellarProbe for the memory jog.

Recall the multivariable chain rule:

$\displaystyle \frac{\partial u(r,\theta)}{\partial r}=\frac{\partial u(x(r,\theta),y(r,\theta))}{\partial r} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial r}+\frac{\partial u}{\partial y}\frac{\partial y}{\partial r}$

Clearly,

$\displaystyle \frac{\partial u(x(r,\theta),y(r,\theta))}{\partial r} \ne \frac{\partial u}{\partial y}\frac{\partial y}{\partial r}$

which is what the OP assumes, for the case $u=r$. You could calculate each term separately as in the OP, or, just to show the multivariable chain rule hasn't failed us yet (using the chain rule identity above with $u=r$):

${\displaystyle f=\frac{\partial r(x,y)}{\partial y}\frac{\partial y(r,\theta)}{\partial r} = \frac{\partial r}{\partial r} - \frac{\partial r}{\partial x}\frac{\partial x}{\partial r} = 1 - \cos^2\theta = \sin^2\theta }$

This agrees with the result in the OP.
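To see that both terms of the multivariable chain rule are needed for a general $u$, here is a rough numerical sketch. The test function `u_xy` is hypothetical, picked only for illustration, and the derivatives of $u$ are central-difference approximations rather than exact values.

```python
import math

def u_xy(x, y):
    # Hypothetical test function, chosen only for illustration.
    return x * x * y + math.sin(y)

def u_polar(r, theta):
    # The same function expressed through polar coordinates.
    return u_xy(r * math.cos(theta), r * math.sin(theta))

def central(g, t, h=1e-6):
    """Central-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

def chain_rule_terms(r, theta):
    """Return the x-term and the y-term of the chain rule for du/dr."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    du_dx = central(lambda s: u_xy(s, y), x)   # y held fixed
    du_dy = central(lambda s: u_xy(x, s), y)   # x held fixed
    dx_dr = math.cos(theta)                     # theta held fixed
    dy_dr = math.sin(theta)                     # theta held fixed
    return du_dx * dx_dr, du_dy * dy_dr

def du_dr(r, theta):
    # Direct derivative along r with theta held fixed.
    return central(lambda s: u_polar(s, theta), r)

x_term, y_term = chain_rule_terms(1.5, 0.4)
print(du_dr(1.5, 0.4), x_term + y_term, y_term)
```

The sum of the two terms should agree with the direct derivative, while the $y$-term alone falls short by exactly the $x$-term.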

In summary, with multivariate functions you cannot simply "cancel" differentials. Any muscle memory from single variable calculus needs to be corrected.

Update: There is a case where partial derivatives can be cancelled, though it's a matter of notation. If we use Einstein Summation, then it is true that:

$$ \frac{\partial u(x^1,x^2,\ldots,x^n)}{\partial \bar{x}^i}\frac{\partial \bar{x}^i}{\partial x^j}=\frac{\partial u}{\partial x^j} $$ Assuming $x^i=f_i(\bar{x}^1,\bar{x}^2,\ldots,\bar{x}^n)$.

Of course this works only because the repeated $i$ index on the LHS indicates a summation, so that the LHS is actually the correct application of the chain rule for the partial derivative on the RHS.

1
On

This is what Penrose calls "the second fundamental confusion of calculus". The problem is that with partial derivatives, what is held constant is just as important as what is varied. A simple example is if we change coordinates by $$ x = u+v \\ y = v, $$ or $$ u = x-y \\ v = y $$ Then $$ \frac{\partial y}{\partial u} = 0, \quad \text{but} \quad \frac{\partial u}{\partial y} = -1, $$ which is quite different from in one dimension. Even weirder, $$ \frac{\partial u}{\partial v} = 0, \quad \text{ but } \quad \frac{\partial u}{\partial y} = -1, $$ even though $v=y$!

The reason is that in the first expression, $u$ is implicitly held constant, whereas in the second, $x$ is held constant, and although $v=y$, clearly $u \neq x$. This is why partial derivatives are such a nuisance compared to differentials: a partial derivative is moving along a line, and a line requires specifying everything that is constant along it, whereas differentials act more like hyperplanes, which only need a normal to be specified.
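The $u,v$ example can be spelled out with the held-fixed variable made explicit in code. This is only a sketch: the function names and the sample point are invented for illustration, and the derivatives are central differences.

```python
def central(g, t, h=1e-6):
    """Central-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

# Coordinates from the answer: u = x - y, v = y (so x = u + v, y = v).
def u_from_xy(x, y):
    return x - y

def u_from_uv(u, v):
    return u  # u is itself a coordinate in the (u, v) system

def y_from_uv(u, v):
    return v

# Arbitrary sample point, chosen only for illustration.
x0, y0 = 3.0, 1.0
u0, v0 = x0 - y0, y0

du_dv_u_fixed = central(lambda v: u_from_uv(u0, v), v0)  # vary v, hold u
du_dy_x_fixed = central(lambda y: u_from_xy(x0, y), y0)  # vary y, hold x
dy_du_v_fixed = central(lambda u: y_from_uv(u, v0), u0)  # vary u, hold v

print(du_dv_u_fixed, du_dy_x_fixed, dy_du_v_fixed)
```

Varying $v$ and varying $y$ change the same number, but the first moves along a line of constant $u$ and the second along a line of constant $x$, which is why the two derivatives of $u$ differ.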

0
On

One of my favorites in the hall of shame of differential cancellation: Consider the plane $x+y+z=0.$ At every point on this plane, each of the variables can be solved explicitly(!) as a function of the other two. We easily find

$$\frac{\partial x}{\partial y}=\frac{\partial y}{\partial z} = \frac{\partial z}{\partial x}=-1.$$

Now consider

$$\frac{\partial x}{\partial y}\cdot\frac{\partial y}{\partial z} \cdot \frac{\partial z}{\partial x}$$

This is a merry canceler's delight: All differentials cancel and so the product is $1$! But as we've seen, each factor is $-1,$ so the product is actually $-1.$
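The sign can be traced by writing out what each partial holds fixed. A sketch, assuming the constraint is written as $F(x,y,z)=0$ with all three partial derivatives of $F$ nonzero (the name $F$ is introduced here for illustration; for the plane above, $F=x+y+z$). Implicit differentiation gives

$$\left(\frac{\partial x}{\partial y}\right)_z=-\frac{F_y}{F_x},\qquad \left(\frac{\partial y}{\partial z}\right)_x=-\frac{F_z}{F_y},\qquad \left(\frac{\partial z}{\partial x}\right)_y=-\frac{F_x}{F_z},$$

so that

$$\left(\frac{\partial x}{\partial y}\right)_z\left(\frac{\partial y}{\partial z}\right)_x\left(\frac{\partial z}{\partial x}\right)_y=(-1)^3\,\frac{F_y}{F_x}\cdot\frac{F_z}{F_y}\cdot\frac{F_x}{F_z}=-1.$$

Each factor holds a different variable fixed, so the three are not links of a single chain and there is nothing to cancel.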