Suppose a function $f(x,y)$ outputs a value in a third dimension, for example $z=f(x,y)$. How can we treat the gradient of $f$ as a function of $x$ and $y$, when the output of the gradient is a vector in the two dimensions $x$ and $y$, which are the dimensions of the inputs? I guess my question is: is it normal for a function to map to the dimensions of its inputs?
How can a gradient be thought of as a function?
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 5 answers below.
On
A function $F$ can be thought of as a machine that takes in inputs from one set, say $X$, and outputs elements in another set, say $Y$. A function outputs a single element for every input, and we write it $F : X \rightarrow Y$.
In your case, the gradient of $f$ is just a function $\nabla f : \mathbb{R}^2 \rightarrow \mathbb{R}^2$, where for each input $(x,y)$, it outputs the element $\left( \frac{\partial f}{\partial x}(x,y), \frac{\partial f}{\partial y}(x,y) \right)$.
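To make this concrete, here is a minimal Python sketch. The choice $f(x,y) = x^2 + y^2$ and the names `f` and `grad_f` are my own assumptions for illustration; they are not part of the question. The point is only that `grad_f` is an ordinary function: two numbers in, a pair of numbers out.

```python
def f(x, y):
    """Example scalar function R^2 -> R (chosen only for illustration)."""
    return x**2 + y**2


def grad_f(x, y):
    """Gradient of f, worked out by hand: a function R^2 -> R^2."""
    return (2 * x, 2 * y)


point = (1.0, 2.0)
value = f(*point)        # one number: the height z above the point
vector = grad_f(*point)  # a pair of numbers: a vector in the x-y plane
```

The output pair lives in the same plane as the input, but it is read as a direction and length attached to the input point rather than as another location.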
On
$(1)$ The gradient is a vector-valued function: it maps pairs of numbers $(x,y) \in \mathbb R^2$ to other pairs of numbers $(x',y')\in \mathbb R^2$. We call these pairs of numbers vectors, and they have a very geometric interpretation: they have a length and a direction. Specifically, the gradient at a point gives the direction of steepest ascent, and its length is the rate of increase in that direction.
Also see:
https://en.wikipedia.org/wiki/Vector-valued_function
This Khan Academy video is also very useful, as it uses the same example as I do: https://www.khanacademy.org/math/multivariable-calculus/multivariable-derivatives/gradient-and-directional-derivatives/v/gradient-and-graphs
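The steepest-ascent interpretation can be checked numerically. The following is a rough sketch, assuming the example $f(x,y) = x^2 + y^2$ and an arbitrarily chosen test point $(1, 2)$. It estimates the rate of change of $f$ along 360 evenly spaced directions and compares the best one with the direction and length of the gradient. The helper `directional_derivative` is a hypothetical name, not a library function.

```python
import math


def f(x, y):
    return x**2 + y**2


def grad_f(x, y):
    # Gradient of f computed by hand.
    return (2 * x, 2 * y)


def directional_derivative(x, y, angle, h=1e-6):
    """Central-difference estimate of the rate of change of f at (x, y)
    along the unit direction (cos(angle), sin(angle))."""
    ux, uy = math.cos(angle), math.sin(angle)
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)


x0, y0 = 1.0, 2.0  # arbitrary test point

# Try one direction per degree and keep the one where f increases fastest.
angles = [2 * math.pi * k / 360 for k in range(360)]
best_angle = max(angles, key=lambda a: directional_derivative(x0, y0, a))
best_rate = directional_derivative(x0, y0, best_angle)

# Direction and length of the gradient at the same point.
gx, gy = grad_f(x0, y0)
gradient_angle = math.atan2(gy, gx)
gradient_length = math.hypot(gx, gy)
```

Up to the one-degree resolution of the search, `best_angle` agrees with `gradient_angle`, and `best_rate` agrees with `gradient_length`.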
$(2)$ In contrast, when you plot a function that maps $\mathbb{R}^2 \rightarrow \mathbb{R}$, like $f(x,y)=x^2 +y^2$, you often define a third variable $z=f(x,y)$ and let its value equal the function value. I have plotted $x^2 +y^2$ in this way below:

[Figure: surface plot of $z = x^2 + y^2$]
Without introducing another axis, we can instead give different ranges of function values different colours, so that a large value is very dark and a low value is very light, or vice versa. We recognise the same function:

[Figure: colour-coded plot of $x^2 + y^2$]
The difference between $(1)$ and $(2)$ is the notion of direction. The function you describe is usually called a "scalar field": at each point it has a magnitude or "value" but no direction. Taking a gradient gives you a so-called "vector field", as physicists often call it, which we usually visualise using vector plots. Below is such a plot, of the vector field $F(x,y)=(x^2+y,y^2+x)$.

[Figure: vector plot of $(x^2+y, y^2+x)$]

For comparison, the scalar field $f(x,y)=x^2+y^2$ from above has gradient $\nabla f(x,y)=(2x,2y)$, which is itself a vector field.
On
The gradient operator is a higher-order function: it maps functions to functions. In the case of scalar fields on real vector spaces, $$ \nabla :(\mathbb{R}^n\to\mathbb{R}) \to (\mathbb{R}^n\to\mathbb{R}^n). $$ Thus, if $F:\mathbb{R}^2\to\mathbb{R}$, then $\nabla F : \mathbb{R}^2 \to \mathbb{R}^2$, and if you evaluate that at some point, you get a single vector, $\nabla F(x,y) \in \mathbb{R}^2$. Specifically, $$ \nabla F(x,y) = \begin{pmatrix}\frac{\partial F(x,y)}{\partial x} \\ \frac{\partial F(x,y)}{\partial y}\end{pmatrix}. $$ See also What does the symbol nabla indicate?
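The "function in, function out" view translates directly into code. This is only a sketch: `gradient` here is a hypothetical helper that approximates $\nabla$ with central finite differences (step size `h` is an assumption), not an exact or library implementation, and the example `F` is chosen for illustration.

```python
from typing import Callable, Tuple

ScalarField = Callable[[float, float], float]                # R^2 -> R
VectorField = Callable[[float, float], Tuple[float, float]]  # R^2 -> R^2


def gradient(F: ScalarField, h: float = 1e-6) -> VectorField:
    """Take a scalar field F and return a new function approximating nabla F."""
    def grad_F(x: float, y: float) -> Tuple[float, float]:
        # Central differences approximate the two partial derivatives.
        dF_dx = (F(x + h, y) - F(x - h, y)) / (2 * h)
        dF_dy = (F(x, y + h) - F(x, y - h)) / (2 * h)
        return (dF_dx, dF_dy)
    return grad_F


def F(x: float, y: float) -> float:
    return x**2 + y**2


grad_F = gradient(F)       # a function R^2 -> R^2
vector = grad_F(1.0, 2.0)  # a single vector, approximately (2.0, 4.0)
```

Note the two separate steps: `gradient(F)` produces a function, and only calling that function at a point produces a vector.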
On
Tensors play an important role here. Examples include the usual vectors in three-dimensional space, as well as matrices.
A tensor in a general sense is a function that takes vectors or one-forms and returns a real number. Further, the tensor is a linear function of these inputs. As with any other function, you can take the derivative of a tensor.
For example, suppose you have a rotation matrix that rotates points in the xy plane about the z axis. The matrix is a tensor. Its input is a vector. Linear changes in the input result in linear changes in the output.
In the case of a rotation matrix, there are in effect 2 inputs: how much to rotate by, and the initial position. So you can think of the final position as a vector function of $x$, $y$, and $\theta$, and you can consider derivatives with respect to those.
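As a small sketch of the rotation example, the final position can be written as a function of $x$, $y$, and $\theta$, together with its derivative with respect to $\theta$. The counterclockwise convention and the function names `rotate` and `d_rotate_d_theta` are assumptions made for this illustration.

```python
import math


def rotate(x, y, theta):
    """Rotate the point (x, y) counterclockwise by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def d_rotate_d_theta(x, y, theta):
    """Derivative of the rotated position with respect to theta (worked out by hand)."""
    c, s = math.cos(theta), math.sin(theta)
    return (-s * x - c * y, c * x - s * y)


# Linearity in the position: rotating a sum equals the sum of the rotations.
u, v, theta = (1.0, 2.0), (-3.0, 0.5), 0.4
rotated_sum = rotate(u[0] + v[0], u[1] + v[1], theta)
ru, rv = rotate(*u, theta), rotate(*v, theta)
sum_of_rotated = (ru[0] + rv[0], ru[1] + rv[1])

# Cross-check the hand-derived theta derivative with a central difference.
h = 1e-6
plus, minus = rotate(*u, theta + h), rotate(*u, theta - h)
numeric = ((plus[0] - minus[0]) / (2 * h), (plus[1] - minus[1]) / (2 * h))
analytic = d_rotate_d_theta(*u, theta)
```

The output is linear in the position $(x, y)$ but not in $\theta$, which is why the derivative with respect to $\theta$ is a separate calculation.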
Consider the surface $f(x,y,z)=ax^2+by^2+cz^2=1$ with $a=b=c=1$, which is the unit sphere. What does $\nabla f$ look like? Here $\nabla f = (2x, 2y, 2z)$, which is double the position vector pointing from the origin to the point. Any infinitesimal displacement tangent to the surface will be perpendicular to the gradient.
$df=\nabla f \cdot d\vec{s}$
If $|df|>0$, then you are necessarily leaving the surface.
As you change $a,b,c$, you get a different surface and a different gradient, which is always perpendicular to the surface. This is another sense in which it makes sense to consider a derivative of a gradient.
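The perpendicularity claim can be checked with a short sketch. The curve used here, the slice of the surface in the plane $z=0$, the parameter value $t = 0.7$, and the second coefficient set $(1, 4, 9)$ are all my own choices for illustration. The velocity of a curve lying in the surface is a tangent displacement, so $df = \nabla f \cdot d\vec{s}$ should vanish along it.

```python
import math


def f(p, a, b, c):
    x, y, z = p
    return a * x**2 + b * y**2 + c * z**2


def grad_f(p, a, b, c):
    """Gradient of f, worked out by hand."""
    x, y, z = p
    return (2 * a * x, 2 * b * y, 2 * c * z)


def curve(t, a, b):
    """A curve lying in the surface f = 1, in the plane z = 0."""
    return (math.cos(t) / math.sqrt(a), math.sin(t) / math.sqrt(b), 0.0)


def velocity(t, a, b):
    """Derivative of curve(t) with respect to t: a tangent displacement."""
    return (-math.sin(t) / math.sqrt(a), math.cos(t) / math.sqrt(b), 0.0)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


results = []
for a, b, c in [(1.0, 1.0, 1.0), (1.0, 4.0, 9.0)]:
    t = 0.7
    p = curve(t, a, b)
    on_surface = f(p, a, b, c)                                # should be 1
    df_tangent = dot(grad_f(p, a, b, c), velocity(t, a, b))   # should be 0
    results.append((on_surface, df_tangent))

# On the unit sphere the gradient is double the position vector.
p_sphere = curve(0.7, 1.0, 1.0)
g_sphere = grad_f(p_sphere, 1.0, 1.0, 1.0)
```

For both coefficient sets the tangent displacement gives $df = 0$ up to rounding, even though the surfaces and their gradients differ.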



Generally that is not how a function of multiple variables works. If you happen to have a function that takes two real numbers as input and produces a single real number as its output, then you can plot the function in three dimensions, using two dimensions for the input and one for the output. To consider this as the definition of a function, however, or even to consider it as a "typical" function, is a mistake.
Yes, it is "normal" in the sense that you will often encounter perfectly good functions with that property. There are also perfectly good functions that map to fewer dimensions than their input, and perfectly good functions that map to more dimensions than their input.
As an example of the "more dimensions" case, consider the position of a particle in space as a function of time: one input dimension, three output dimensions.
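The three cases can be put side by side in a small sketch. The specific functions (a paraboloid height, its gradient, and a helix for the particle) are hypothetical examples chosen only to show the input and output dimensions.

```python
import math


def height(x, y):
    """Two inputs, one output: R^2 -> R (fewer output dimensions)."""
    return x**2 + y**2


def gradient_of_height(x, y):
    """Two inputs, two outputs: R^2 -> R^2 (same number of dimensions)."""
    return (2 * x, 2 * y)


def position(t):
    """One input, three outputs: R -> R^3 (more output dimensions).
    A particle moving along a helix as time t passes."""
    return (math.cos(t), math.sin(t), t)


start = position(0.0)
slope = gradient_of_height(1.0, 2.0)
z = height(1.0, 2.0)
```

None of these is more "typical" than the others; the number of outputs is simply part of the definition of each function.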