Given a function $f$ on $\mathbb{R}^n$, the gradient is defined to be $\left(\frac{\partial f}{\partial x_1},\dots,\frac{\partial f}{\partial x_n}\right)$. But here $x_1,\dots, x_n$ are assumed to be the standard coordinates. If we change the coordinates then the gradient changes. I believe the same thing can be said about the gradient on a manifold. If someone gives Alice some function $f$ and some coordinate system, and gives Bob the same function $f$ with some other coordinate system, will they give two different answers to what the gradient is? Doesn't this mean the gradient is not well defined?
What is the gradient of a function?
982 Views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 3 answers below.
Arguing like this, you could say that a vector (in general) doesn't exist, because in every basis it has a different expression. But that's the very idea of vectors and vector spaces! One vector can have different coordinate expressions in different bases, yet it is still the same object. What matters is that the coordinate expression transforms in the right way when you change the basis.
With gradient vector fields it is a bit more complex: the gradient is defined (as a vector) only if you have a metric on your manifold. It is defined as the dual (w.r.t. the metric) of the differential $df$, which is a one-form. This differential is invariantly defined; that is, in local coordinates $x^i$ resp. $y^i$ it equals $\sum_j \frac{\partial f}{\partial x^j}\, dx^j=\sum_j \frac{\partial f}{\partial y^j}\, dy^j$, where $dy$ and $dx$ are related by $dy^j=\sum_{k} \frac{\partial y^j}{\partial x^k}\, dx^k$. This always makes sense, even without a metric: the differential assigns to a vector $v$ the partial derivative of $f$ in the $v$-direction. But it is not a vector field; rather, it is a covector field. If you consider just the $n$-tuple $\bigl(\frac{\partial f}{\partial x^i}\bigr)_i$, you can easily check that under a change of coordinates this $n$-tuple transforms as a covector.
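As a quick numerical illustration of the covector transformation rule above (the function, the point, and the choice of polar coordinates are my own, not from the answer), one can check that the $n$-tuple of partials in polar coordinates equals the Jacobian-transpose applied to the Cartesian partials:

```python
import numpy as np

# f(x1, x2) = x1^2 + 3*x2 in Cartesian coordinates, with exact partials
df_cart = lambda x1, x2: np.array([2*x1, 3.0])   # (df/dx1, df/dx2)

# the same f in polar coordinates (r, th): f = r^2 cos^2(th) + 3 r sin(th)
df_pol = lambda r, th: np.array([2*r*np.cos(th)**2 + 3*np.sin(th),
                                 -2*r**2*np.sin(th)*np.cos(th) + 3*r*np.cos(th)])

r, th = 2.0, 0.5
x1, x2 = r*np.cos(th), r*np.sin(th)

# Jacobian dx^k/dy^j of the coordinate change x(r, th)
J = np.array([[np.cos(th), -r*np.sin(th)],
              [np.sin(th),  r*np.cos(th)]])

# covector (lower-index) rule: df/dy^j = sum_k (dx^k/dy^j) df/dx^k
transformed = J.T @ df_cart(x1, x2)
print(np.allclose(transformed, df_pol(r, th)))   # True
```

The same check with the Jacobian itself (instead of its transpose) fails, which is exactly the statement that the partials transform as a covector, not as a vector.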
If you have a metric $g_{ij}$ (in $x$-coordinates) resp. $\tilde{g}_{ij}$ (in $y$-coordinates), then the gradient vector can be expressed as $$ \Bigl(\sum_j g^{ij}\frac{\partial f}{\partial x^j}\Bigr)_i $$ resp. $$ \Bigl(\sum_j \tilde{g}^{ij}\frac{\partial f}{\partial y^j}\Bigr)_i. $$ These $n$-tuples are just expressions of the same vector in different bases, where the transition matrix is given by $\frac{\partial x^i}{\partial y^j}$ or vice versa.
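Continuing the numerical example (again with my own choice of function and coordinates): raising the index of the polar partials with the Euclidean metric in polar form, $g=\operatorname{diag}(1,r^2)$, and pushing the result through the Jacobian recovers the Cartesian gradient, so both expressions describe one and the same vector:

```python
import numpy as np

# f(x1, x2) = x1^2 + 3*x2; in Cartesian coordinates the metric is the identity,
# so the gradient components are just the partials
grad_cart = lambda x1, x2: np.array([2*x1, 3.0])

# partials of the same f in polar coordinates (r, th)
df_pol = lambda r, th: np.array([2*r*np.cos(th)**2 + 3*np.sin(th),
                                 -2*r**2*np.sin(th)*np.cos(th) + 3*r*np.cos(th)])

r, th = 2.0, 0.5
x1, x2 = r*np.cos(th), r*np.sin(th)

# Euclidean metric in polar coordinates: g = diag(1, r^2)
g_inv = np.diag([1.0, 1.0/r**2])

# gradient components in polar coordinates: v^i = sum_j g^{ij} df/dy^j
v_pol = g_inv @ df_pol(r, th)

# express those components in the Cartesian basis via the Jacobian dx^i/dy^j
J = np.array([[np.cos(th), -r*np.sin(th)],
              [np.sin(th),  r*np.cos(th)]])
v_cart = J @ v_pol
print(np.allclose(v_cart, grad_cart(x1, x2)))   # True

# both coordinate systems also agree on the squared norm g(v, v)
g = np.diag([1.0, r**2])
print(np.isclose(v_pol @ g @ v_pol, v_cart @ v_cart))   # True
```

The last check is the coordinate-free fact hiding behind all this: different observers disagree on components but agree on every metric quantity built from them.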
When $f:\>{\mathbb R}^n\to{\mathbb R}$ is differentiable at the point $a$ then $$f(a+X)-f(a)=\phi(X)+o\bigl(|X|\bigr)\qquad(X\to0)\tag{1}$$ for a certain linear function (a functional) $\phi:\>{\mathbb R}^n\to{\mathbb R}$. This linear function is called the differential of $f$ at $a$, and is denoted by $df(a)$. Therefore we can replace $(1)$ by $$f(a+X)-f(a)=df(a).X+o\bigl(|X|\bigr)\qquad(X\to0)\ .\tag{2}$$ When a scalar product $\cdot$ is defined in ${\mathbb R}^n$ then the functional $df(a)$ can be represented by a vector $A\in{\mathbb R}^n$ as follows: $$df(a).X=A\cdot X\qquad \forall X\in{\mathbb R}^n\ .$$ This vector $A$ is called the gradient of $f$ at $a$, and is denoted by $\nabla f(a)$. We therefore can replace $(2)$ by $$f(a+X)-f(a)=\nabla f(a)\cdot X+o\bigl(|X|\bigr)\qquad(X\to0)\ .$$ Note that so far we have not talked about coordinates at all. But if coordinates are adopted we'd like to know how the coordinates of $\nabla f(a)$ are computed. Here one can say the following: If the coordinates $(x_1,\ldots, x_n)$ refer to an orthonormal coordinate system then $$\nabla f(a)=\left({\partial f\over\partial x_1},\ldots,{\partial f\over\partial x_n}\right)_a\ .$$
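As a numerical sanity check of the defining property $f(a+X)-f(a)=\nabla f(a)\cdot X+o(|X|)$ (the particular function and point below are my own choices): the approximation error, divided by $|X|$, should tend to $0$ as $X\to0$.

```python
import numpy as np

f     = lambda p: np.sin(p[0]) * p[1]                       # f(x, y) = sin(x) * y
gradf = lambda p: np.array([np.cos(p[0]) * p[1], np.sin(p[0])])

a = np.array([0.7, 1.3])   # base point
X = np.array([0.3, -0.2])  # direction of the increment

# scale the increment down and watch the relative error of the linearization
errs = []
for t in [1e-1, 1e-2, 1e-3]:
    h = t * X
    err = abs(f(a + h) - f(a) - gradf(a) @ h)
    errs.append(err / np.linalg.norm(h))   # this is the o(|X|)/|X| term
print(errs)   # decreasing toward 0
```

The printed ratios shrink roughly linearly in $t$, as expected for a twice-differentiable $f$ (the error is actually $O(|X|^2)$, which is stronger than $o(|X|)$).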
Because they are using different coordinates, Alice and Bob will not get the same components for the gradient. They will, however, agree on the norm of the gradient, and if you give Alice the coordinate transform from Bob's coordinates to hers, then by applying the corresponding pullback to her gradient she will recover Bob's components.
This whole component approach to the gradient is quite misleading in my opinion; I believe it is more useful to think about it in the following way:
Let $(V,g)$ be a real, finite-dimensional inner product space (if $V$ is an infinite-dimensional Banach space, similar considerations apply), and let $f:V\rightarrow\mathbb{R}$ be a scalar function.
Then the Fréchet derivative of $f$ at $x_0\in V$ is the linear functional $df|_{x_0}:V\rightarrow\mathbb{R}$ that satisfies $$\lim_{h\rightarrow0}\frac{\bigl|f(x_0+h)-f(x_0)-df|_{x_0}(h)\bigr|}{\|h\|}=0,$$ i.e., it is a local linearization of $f$. Let $h=tv$, where $v\in V$ and $t$ is a scalar. Then it is easy to check that if $f$ is Fréchet-differentiable, then $$df|_{x_0}(v)=\lim_{t\rightarrow0}\frac{f(x_0+tv)-f(x_0)}{t}, $$ which is exactly the formula for the directional derivative.
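The limit formula above can be checked numerically; in the sketch below (my own example function and point, chosen for illustration), a small-$t$ difference quotient is compared against the exact linear functional applied to $v$:

```python
import numpy as np

f = lambda x: x[0]**2 * x[1] + np.exp(x[1])   # f(x, y) = x^2 y + e^y

x0 = np.array([1.0, 0.5])
v  = np.array([2.0, -1.0])

# df|_{x0}(v) via the limit formula, approximated with a small t
t = 1e-6
fd = (f(x0 + t*v) - f(x0)) / t

# exact Fréchet derivative applied to v: linear in v, components are the partials
partials = np.array([2*x0[0]*x0[1], x0[0]**2 + np.exp(x0[1])])
print(fd, partials @ v)   # the two numbers agree to about 1e-5
```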
The linear functional $df|_{x_0}$ is obviously independent of coordinates, since we haven't used any in its definition, and it is obviously an element of the dual space $V^*$. We have also obtained that if $f$ is Fréchet-differentiable, then the directional derivative of $f$ in the direction of $v$ is exactly the Fréchet derivative applied to $v$.
We know that to find the components of a linear functional in a basis $E=(e_1,...,e_n)$, we need to apply the linear functional to each of the $e_i$'s: $$ df|_{x_0}(e_i)=\lim_{t\rightarrow0}\frac{f(x_0+te_i)-f(x_0)}{t}=\left.\frac{\partial f}{\partial x^i}\right|_{x_0}, $$ where $x^i$ are the linear coordinates associated with the basis $E$. Since this is done point-wise, the definition also works for non-linear coordinates, in which case $E$ is the coordinate basis associated with the coordinate system. So the components of the Fréchet derivative are the partial derivatives along the coordinates.
Since we have an inner product $g$, we can use the isomorphism $V^*\rightarrow V$ it induces: $\nabla f|_{x_0}$ is defined as the unique vector satisfying $$ g(\nabla f|_{x_0},v)=df|_{x_0}(v)\qquad\text{for all } v\in V. $$ The map $$x_0\mapsto \nabla f|_{x_0} $$ then defines a vector field on $V$, which we call the gradient of $f$.
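To make the metric isomorphism concrete, here is a sketch with a non-Euclidean inner product on $V=\mathbb{R}^2$ (the matrix for $g$, the function, and the point are my own illustrative choices): the gradient components are obtained by solving $g\,\nabla f = df$, and the defining property $g(\nabla f, v)=df(v)$ holds for every direction $v$.

```python
import numpy as np

# an inner product g on V = R^2, given by a symmetric positive-definite matrix
g = np.array([[2.0, 1.0],
              [1.0, 3.0]])

f  = lambda x: x[0]**3 + x[0]*x[1]
x0 = np.array([1.0, 2.0])

# components of the Fréchet derivative df|_{x0}: the partial derivatives
df = np.array([3*x0[0]**2 + x0[1], x0[0]])   # = (5.0, 1.0)

# gradient = the vector representing df through g:  g(grad, v) = df(v) for all v
grad = np.linalg.solve(g, df)

# check the defining property on a random direction v
rng = np.random.default_rng(0)
v = rng.standard_normal(2)
print(np.isclose(grad @ g @ v, df @ v))   # True
```

Note that with this $g$ the gradient components are not the bare partials; they coincide only when $g$ is the identity, i.e., in orthonormal coordinates.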
As you can see, the gradient is perfectly well defined without coordinates. What Alice and Bob will not agree on is the coordinate expression of the gradient.