Why is the derivative of a function at a point a linear functional on the tangent space?


My apologies if this question has been asked before but I haven't been able to find a satisfying answer.

Whenever I look into the definition of tangent spaces, it is always in the context of manifolds or differential geometry, two topics I do not know much about. I am asking because I have seen, in some definitions, that the derivative of a function at a point $p$, say $f: \mathbb{R}^n \rightarrow \mathbb{R}$, is actually a linear functional acting on the tangent space of $\mathbb{R}^n$ at that point. I find this definition very interesting, but I am not sure I am grasping the geometric intuition, because I do not think I understand what the tangent space represents. If I try to visualize it, how would it relate to the tangent plane of a surface at a point?


Best answer:

It might be easier to see if we put this in the context of single variable calculus.

In calculus, given a function $f:\mathbb R\to\mathbb R$, we can take the derivative at a single point $p\in\mathbb R$. This gives us the tangent line to the curve at the point $(p,f(p))$: $$\frac{y-f(p)}{x-p}=f'(p)\implies y-f(p)=f'(p)(x-p).$$

Let $dx=x-p$ and let $dy=y-f(p)$. Then the tangent line becomes $$dy=f'(p)dx$$ and this is the map from the tangent space of $\mathbb R$ at $x=p$ to the tangent space of $\mathbb R$ at $y=f(p)$.

The elements of the tangent space at $x=p$ are the $dx$'s: the changes we can make in any direction in $\mathbb R$. Since $\mathbb R$ is one-dimensional, there is only one direction of change, namely along the $x$ axis.
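The single-variable picture can be checked numerically. A minimal sketch (the function and point are my own illustrative choices, not from the answer): the tangent map sends a displacement $dx$ to $dy = f'(p)\,dx$, and this linear map agrees with the actual change in $f$ to first order.

```python
import math

# Illustrative example: f = sin, so f' = cos (assumed, not from the answer).
f = math.sin
fprime = math.cos

p = 1.0
dx = 1e-4                    # a small element of the tangent space at x = p

dy_linear = fprime(p) * dx   # the tangent map applied to dx
dy_actual = f(p + dx) - f(p) # the actual change in f

# The two agree to first order: the error is O(dx^2).
assert abs(dy_linear - dy_actual) < dx**2
```

The assertion passes because the discrepancy between the tangent map and the true increment shrinks quadratically in $dx$, which is exactly what "best linear approximation" means here.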

The same analysis can be done with a function $g:\mathbb R^n\to\mathbb R$. Consider a point $q=(q_1,\dots,q_n)\in\mathbb R^n$.

We have $y=g(x_1,\dots,x_n)$ and $$dy=\frac{\partial y}{\partial x_1}(q_1,\dots,q_n)\,dx_1+\dots +\frac{\partial y}{\partial x_n}(q_1,\dots,q_n)\,dx_n=\nabla g(q_1,\dots,q_n)\cdot d\vec x=\nabla g(q_1,\dots,q_n)\begin{pmatrix}dx_1\\ \vdots\\ dx_n\end{pmatrix}.$$

The derivative is represented by the gradient, which is a linear functional from the tangent space of $\mathbb R^n$ at $q=(q_1,\dots,q_n)$ to the tangent space of $\mathbb R$ at $y=g(q_1,\dots,q_n)$. Note that $dx_i$ is the change along the $x_i$ axis in $\mathbb R^n$.
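A numerical sketch of this functional (the function $g$ and the point $q$ below are assumed examples, not from the answer): the gradient at $q$ eats a tangent vector $v=(dx_1,\dots,dx_n)$ and returns the directional derivative $\nabla g(q)\cdot v$, linearly in $v$.

```python
import numpy as np

def g(x):
    # Illustrative g: R^2 -> R (an assumption for the demo)
    return x[0] ** 2 + 3.0 * x[0] * x[1]

q = np.array([1.0, 2.0])
# Analytic gradient of g at q: (2x + 3y, 3x) = (8, 3)
grad_g = np.array([2 * q[0] + 3 * q[1], 3 * q[0]])

v = np.array([0.5, -1.0])    # a tangent vector at q
h = 1e-6
# Finite-difference directional derivative of g at q in direction v
directional = (g(q + h * v) - g(q)) / h

# The functional value grad_g . v matches the directional derivative...
assert abs(grad_g @ v - directional) < 1e-4
# ...and the map v -> grad_g . v is linear: doubling v doubles the output.
assert abs(grad_g @ (2 * v) - 2 * (grad_g @ v)) < 1e-12
```

The point of the demo is that the gradient is not just a vector of numbers: paired with tangent vectors via the dot product, it is precisely the linear functional the question asks about.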

Second answer:

Geometrically, this arises when you want to zoom in on either the function itself or its graph. The formulas are simplest if you assume that you want to zoom in on a neighborhood of $0$ in the domain and that $f(0) = 0$. Given a small $\delta > 0$, we magnify the ball of radius $\delta$ into a ball of radius $1$ and also magnify the value of $f$ by the same ratio. This is equivalent to magnifying the graph near $0$ by the ratio $\delta^{-1}$. The magnified function is $$ f_\delta(x) = \delta^{-1}f(\delta x). $$ If $f$ is smooth, it's clear intuitively that, as $\delta \rightarrow 0$, the graph converges to a plane through the origin. Equivalently, $f_\delta$ converges to a linear function.

This can be made rigorous fairly easily as follows. Given $x_0$ (assumed to be $0$ above), the fundamental theorem of calculus gives \begin{align*} f(x) - f(x_0) &= \int_{t=0}^{t=1}\frac{d}{dt}f(x_0 + t(x-x_0))\,dt\\&= (x-x_0)\cdot \int_{t=0}^{t=1} \nabla f(x_0+t(x-x_0))\,dt\\ &= (x-x_0)\cdot V(x), \end{align*} where $V(x)$ denotes the integral in the middle line. It follows that if $x_0 = 0$ and $f(x_0) = 0$, then \begin{align*} f_\delta(x) &= x\cdot V(\delta x). \end{align*} If we assume, for example, that $\nabla f$ is continuous at zero, then \begin{align*} \lim_{\delta\rightarrow 0} f_\delta(x) &= x\cdot \lim_{\delta\rightarrow 0}V(\delta x)\\ &= x\cdot \nabla f(x_0), \end{align*} which is a linear function.
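The blow-up limit is easy to watch numerically. A sketch (the particular $f$ below is an assumed example with $f(0)=0$): the rescaled functions $f_\delta(x)=\delta^{-1}f(\delta x)$ approach the linear function $x\cdot\nabla f(0)$ as $\delta\to 0$.

```python
import numpy as np

def f(x):
    # Illustrative smooth f with f(0) = 0 and grad f(0) = (1, 0)
    return np.sin(x[0]) + x[0] * x[1]

grad_f0 = np.array([1.0, 0.0])
x = np.array([2.0, -1.0])

# f_delta(x) = f(delta * x) / delta should approach x . grad f(0) = 2.0
for delta in (1.0, 0.1, 0.01):
    print(delta, f(delta * x) / delta)

assert abs(f(0.001 * x) / 0.001 - grad_f0 @ x) < 0.01
```

Each halving of $\delta$ brings $f_\delta(x)$ closer to the linear value, mirroring the continuity-of-$\nabla f$ argument above.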

It turns out that if we assume only that $\nabla f(0)$ exists, then $f_\delta$ does not necessarily converge to a linear function. For many purposes, especially in functional analysis, we do need the derivative of $f$ to be linear, so it is imposed as an assumption. This is particularly important if the domain of $f$ is infinite dimensional. See, for example, the Wikipedia article on the Fréchet derivative.

A natural question is whether there is a way to approximate $f$ using a quadratic function, so that its graph becomes a parabola, hyperbola, or their higher dimensional analogues. Naively, you want to take the limit of something like $$ g_\delta(x) = \delta^{-2}f(\delta x). $$ This can be done, but you have to get rid of the linear term first. Otherwise, this function blows up as $\delta \rightarrow 0$. Geometrically, what you do is rotate the graph until its tangent plane at the origin is horizontal. Equivalently, $\nabla f(0) = 0$. Using an argument similar to the one above, if the second derivatives of $f$ are continuous at $0$, then you can write $$ f(x) = x^ix^jH_{ij}(x), $$ where $H_{ij}(0) = \tfrac{1}{2}\partial^2_{ij}f(0)$. Now, $$ \lim_{\delta\rightarrow 0} \delta^{-2}f(\delta x) = x^ix^jH_{ij}(0), $$ which is a quadratic polynomial and therefore has a graph like a parabola or hyperbola. In fact, the symmetric matrix $H_{ij}(0)$ (calculated after you rotate the graph) is a geometric invariant of the graph known as the second fundamental form.
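The quadratic rescaling can be watched the same way. A sketch (the $f$ below is an assumed example with $f(0)=0$ and $\nabla f(0)=0$): $\delta^{-2}f(\delta x)$ converges to the quadratic form $\tfrac12\,x^\top \nabla^2 f(0)\,x$.

```python
import numpy as np

def f(x):
    # Illustrative f with f(0) = 0, grad f(0) = 0, and Hessian at 0 equal to -I
    return np.cos(x[0]) + np.cos(x[1]) - 2.0

x = np.array([1.0, 2.0])
# (1/2) x^T (-I) x = -(1^2 + 2^2)/2 = -2.5
quad_limit = -0.5 * (x[0] ** 2 + x[1] ** 2)

# delta^-2 f(delta x) should approach the quadratic form value
for delta in (1.0, 0.1, 0.01):
    print(delta, f(delta * x) / delta ** 2)

assert abs(f(0.001 * x) / 0.001 ** 2 - quad_limit) < 1e-3
```

Because the linear term vanishes at the origin, the $\delta^{-2}$ blow-up no longer diverges and the limiting graph is the paraboloid of the quadratic form, as the paragraph above describes.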

There's even more to the story. If you don't assume that $\nabla f(0)$ exists, then the limiting function $$ \lim_{\delta\rightarrow 0} \delta^{-1}f(\delta x) $$ might still exist, but its graph is not a plane. You can show that if the limit does exist, the graph will be a cone, called the tangent cone. The simplest example is the absolute value function or, in higher dimensions, $f(x) = |x|$. This situation is also interesting and arises in certain areas of differential geometry, such as the study of minimal hypersurfaces.
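A quick check of the cone example above: $f(x)=|x|$ is homogeneous of degree $1$, so the rescaled function $\delta^{-1}f(\delta x)$ equals $f$ itself for every $\delta>0$; the blow-up limit is the cone $|x|$ rather than a plane.

```python
import numpy as np

def f(x):
    # The cone example from the answer: f(x) = |x| (Euclidean norm)
    return np.linalg.norm(x)

x = np.array([3.0, 4.0])   # |x| = 5, a convenient 3-4-5 point
for delta in (1.0, 0.5, 0.01):
    # Homogeneity of degree 1: f(delta x)/delta = f(x) exactly (up to rounding)
    assert abs(f(delta * x) / delta - f(x)) < 1e-12
```

Scale invariance is what makes the limit exist even though $\nabla f(0)$ does not: zooming in on a cone always returns the same cone.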