Question about the gradient in a coordinate system.


138 Views Asked by Bumbble Comm At 31 Mar 2026 - 9:48

Let $(x,y)$ be a coordinate system and $f:\mathbb R^2\to \mathbb R$ a differentiable function. Then $$\nabla f(x,y)=\left(\partial _xf(x,y),\partial _yf(x,y)\right).$$

If I'm in the coordinates $(u,v)$, then $$\nabla f(u,v)=\left(\partial _uf(u,v),\partial _vf(u,v)\right).$$

So when I'm in $(r,\theta )$, why is the gradient different? That is, why is it $$\nabla f(r,\theta )=\left(r\cos\theta \partial_rf-\sin\theta \partial _\theta f, r\sin\theta \partial _rf+\cos\theta \partial _\theta f\right)$$

and not $(\partial _rf,\partial _\theta f)$ as before? This looks mysterious to me.


There are 3 best solutions below

Bumbble Comm On 26 Dec 2019 - 1:35

$x=r\cos\theta$, $y=r\sin\theta$. Then $f(x,y)=f(r\cos\theta,r\sin\theta)=(f\circ A)(r,\theta)$, where $A(r,\theta)=(r\cos\theta,r\sin\theta)$, so by the chain rule $d(f\circ A)=df\circ Jac(A)$, with

$Jac(A)=\pmatrix{\cos\theta & -r\sin\theta\cr \sin\theta &r\cos\theta}$
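
To spell out the step this answer leaves implicit, here is a sketch in components. The name $g=f\circ A$ (that is, $f$ expressed in polar coordinates) is introduced here only for illustration, and $r>0$ is assumed. The chain rule gives the polar partial derivatives in terms of the Cartesian ones:

$$\begin{aligned}
\partial_r g &= \cos\theta\,\partial_x f+\sin\theta\,\partial_y f,\\
\partial_\theta g &= -r\sin\theta\,\partial_x f+r\cos\theta\,\partial_y f.
\end{aligned}$$

Solving this linear system for $\partial_x f$ and $\partial_y f$ should give

$$\begin{aligned}
\partial_x f &= \cos\theta\,\partial_r g-\frac{\sin\theta}{r}\,\partial_\theta g,\\
\partial_y f &= \sin\theta\,\partial_r g+\frac{\cos\theta}{r}\,\partial_\theta g.
\end{aligned}$$

So the Cartesian components of the gradient are a combination of $\partial_r g$ and $\partial_\theta g$ rather than the pair $(\partial_r g,\partial_\theta g)$ itself. Compared with the expression quoted in the question, this differs by an overall factor of $r$.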

Bumbble Comm On 16 Aug 2021 - 6:34

I think there's another assumption lurking in the background here. In the Cartesian coordinate system, if we have a vector $\langle a,b\rangle$ and unit vectors $e_x$ and $e_y$, then we can represent the point as $p = ae_x + be_y$, with the same $e_x$ and $e_y$ at every point. This is not true in general, and specifically is not true for polar coordinates.

In polar coordinates, the unit vectors vary with the position of the point. They are $e_r$, the unit vector pointing from the origin toward the point $(r,\theta)$, and $e_\theta$, the unit vector orthogonal to $e_r$ (in the direction of increasing $\theta$).

To see where these come from, take $p(x,y) = f(x)e_x + g(y)e_y$ (for the position vector itself, $f(x)=x$ and $g(y)=y$) and re-express it in polar coordinates. We get $p(r,\theta) = p(r\cos\theta,r\sin\theta) = f(r\cos\theta)e_x + g(r\sin\theta)e_y$.

Now, to get the unit vectors, we define them as $e_r = \frac{\frac{dp}{dr}}{|\frac{dp}{dr}|}$ and $e_\theta = \frac{\frac{dp}{d\theta}}{|\frac{dp}{d\theta}|}$, and some algebra shows that these vectors match the description given above.
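
A sketch of that algebra, assuming the position vector $p = xe_x+ye_y$ (that is, taking $f$ and $g$ above to be the identity) and $r>0$:

$$\begin{aligned}
\frac{dp}{dr} &= \cos\theta\,e_x+\sin\theta\,e_y, & \left|\frac{dp}{dr}\right| &= 1, & e_r &= \cos\theta\,e_x+\sin\theta\,e_y,\\
\frac{dp}{d\theta} &= -r\sin\theta\,e_x+r\cos\theta\,e_y, & \left|\frac{dp}{d\theta}\right| &= r, & e_\theta &= -\sin\theta\,e_x+\cos\theta\,e_y.
\end{aligned}$$

The length $\left|\frac{dp}{d\theta}\right| = r$ is the scale factor that shows up as $\frac{1}{r}$ in the polar form of the gradient.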

So when expressing a point in polar coordinates, we don't actually use $e_\theta$; we can just write $p(r,\theta) = r\,e_r(r,\theta)$.

**Bumbble Comm** · Accepted Answer

There is a common confusion about how functions are described.

A function of two variables can be thought of in two different ways: one, as a function taking two inputs; two, as a function taking a point in the plane. Usually these two concepts are conflated. But for the purposes of geometry and calculus, a function should be thought of as taking a point in the plane. Therefore, when you see a definition of a function, you should understand it as implicitly defining a function on the plane. This implicit assumption depends on context: for example, $f(x,y)$ should be understood as the value of $f$ at the point with Cartesian coordinates $(x,y)$, but $f(r,\theta)$ would be understood as the value of $f$ at the point with polar coordinates $(r,\theta)$.

This is why $f(r,\theta)=f(r\cos\theta,r\sin\theta)=f(x,y)$, something that looks nonsensical at first but is not. Here $f(r,\theta)$ is the value of $f$ at the point with polar coordinates $(r,\theta)$, $f(r\cos\theta,r\sin\theta)$ is the value of $f$ at the point with Cartesian coordinates $(r\cos\theta,r\sin\theta)$, and $f(x,y)$ is the value of $f$ at the point with Cartesian coordinates $(x,y)$, where numerically $x=r\cos\theta$, $y=r\sin\theta$.

So remember: when it comes down to it, these functions are defined on points of the plane, and $f(x,y)$ and $f(r,\theta)$ are just different ways of describing the function using coordinates on the plane, because it's impossible to specify a point without a coordinate system.

Next, the issue is whether you are using Cartesian or polar coordinates to describe the gradient. It looks like you used Cartesian, which results in a mess.

Now, a gradient is a vector field $u$ such that for any point $p$ and any vector $v$, $D_{v}f(p)=u(p)\cdot v$, where $\cdot$ is the geometric dot product. I specifically say "geometric dot product" to refer to the definition of the dot product as "product of the lengths times the cosine of the angle in between".

You can describe the vector $v$ in terms of coordinates, and the directional derivative comes out the same whether you're in polar or Cartesian coordinates. What changes is the dot product.

In Cartesian coordinates, the dot product is very simple: $(a,b)\cdot(c,d)=ac+bd$. The geometric dot product can be computed directly from the coordinates, just like that. You don't even need to care where these vectors are based. As a result, the gradient has a very simple form.

But in polar coordinates, the dot product is distorted in terms of coordinates. The reason is that when you change $\theta$ at rate $1$, you no longer move at rate $1$ but at rate $r$, where $r$ is the distance from the origin to the point these vectors are based at. Hence a factor is needed to account for this distortion. So the formula for the dot product in polar coordinates is less nice: $(a,\alpha)\cdot(b,\beta)=ab+r^2\alpha\beta$, where $a,b$ are rates of change of $r$ and $\alpha,\beta$ are rates of change of $\theta$. Notice the extra factor of $r$ on each angular component, which accounts for the length distortion. As a result, the gradient is messier: matching $D_vf=\partial_rf\,a+\partial_\theta f\,\alpha$ against this dot product gives the gradient $\frac{\partial f}{\partial r}R+\frac{1}{r}\frac{\partial f}{\partial\theta}\Theta$ (where $R,\Theta$ are the unit vectors for polar coordinates). Note that this formula immediately gives you the components of the gradient in polar coordinates, but not in Cartesian coordinates. To get back to Cartesian coordinates, you need to account for the rotation, which is how you get that mess.
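
As a rough numerical sanity check of the formula $\frac{\partial f}{\partial r}R+\frac{1}{r}\frac{\partial f}{\partial\theta}\Theta$, here is a minimal Python sketch. The test function $f(x,y)=x^2y+y$, the sample point, and all function names are assumptions chosen purely for illustration. The polar partial derivatives are approximated with central differences, so agreement is only up to a small numerical error.

```python
import math

def f_cartesian(x, y):
    # Hypothetical test function, chosen only for illustration.
    return x**2 * y + y

def grad_cartesian(x, y):
    # Exact Cartesian gradient of the test function above.
    return (2 * x * y, x**2 + 1)

def f_polar(r, theta):
    # The same function on the plane, described in polar coordinates.
    return f_cartesian(r * math.cos(theta), r * math.sin(theta))

def grad_from_polar(r, theta, h=1e-6):
    # Central-difference approximations of the polar partial derivatives.
    df_dr = (f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h)
    df_dtheta = (f_polar(r, theta + h) - f_polar(r, theta - h)) / (2 * h)
    # Polar unit vectors R and Theta, written in Cartesian components.
    R = (math.cos(theta), math.sin(theta))
    Theta = (-math.sin(theta), math.cos(theta))
    # gradient = df/dr * R + (1/r) * df/dtheta * Theta  (requires r > 0)
    gx = df_dr * R[0] + (df_dtheta / r) * Theta[0]
    gy = df_dr * R[1] + (df_dtheta / r) * Theta[1]
    return (gx, gy)

# Arbitrary sample point away from the origin.
r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)

print("Cartesian gradient:      ", grad_cartesian(x, y))
print("From the polar formula:  ", grad_from_polar(r, theta))
```

The two printed vectors should agree to several decimal places. Dropping the `1 / r` factor, or using `(df_dr, df_dtheta)` directly as Cartesian components, would break the agreement at a generic point.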
