Why is the definition of derivative what it is?


In our lectures, we've been taught the following:

We say that $f:\mathbb{R}^3\to\mathbb{R}$ is differentiable at a point $X$ iff there exists $\alpha\in\mathbb{R}^3$ such that $$\epsilon (H)=\frac{f(X+H)-f(X)-\alpha\cdot H}{\|H\|}\to0$$ as $\|H\|\to0$, and the derivative is then $\alpha$.

But I can't understand why this should work. What is the intuition behind setting up $\epsilon(H)$ like this? Why the dot product ($\alpha\cdot H$)? What does $\alpha$ represent physically on the curve?

Please help, thanks.


There are 2 best solutions below


Say $\mathbf{v}$ is a unit vector and $f:\mathbb{R}^n\to\mathbb{R}$ a scalar function.

The directional derivative of $f$ at $\mathbf{x}$ in the direction of $\mathbf{v}$ is

$$ D_{\mathbf{v}}f(\mathbf{x}) = \lim_{h\to0} \frac{f(\mathbf{x}+h\mathbf{v})-f(\mathbf{x})}{h}. $$

If we interpret $f(\mathbf{x}+h\mathbf{v})$ as a function of $h$ with $\mathbf{x},\mathbf{v}$ fixed, this is

$$\begin{array}{rl} \displaystyle \frac{\mathrm{d}}{\mathrm{d}h}f(\mathbf{x}+h\mathbf{v}) & \displaystyle = \frac{\partial f}{\partial x_1}\frac{\partial (x_1+hv_1)}{\partial h}+\cdots+\frac{\partial f}{\partial x_n}\frac{\partial(x_n+hv_n)}{\partial h} \\[5pt] & \displaystyle = \frac{\partial f}{\partial x_1}v_1+\cdots+\frac{\partial f}{\partial x_n}v_n \end{array} $$

at $h=0$ (so all the partials $\partial f/\partial x_i$ are evaluated at $\mathbf{x}$) by the multivariable chain rule.

This is just the dot product $D_{\mathbf{v}}f(\mathbf{x})=\nabla f(\mathbf{x})\cdot \mathbf{v}$ where $\nabla f$ is the gradient.
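As a quick numerical sanity check of this identity, here is a minimal sketch in Python. The function `f`, the point, and the direction are hypothetical choices made only for illustration; the gradient is computed by hand for that particular `f`.

```python
import math

def f(x, y, z):
    # Hypothetical example function, chosen only for illustration.
    return x * x * y + math.sin(z)

def grad_f(x, y, z):
    # Gradient of the example function above, computed by hand.
    return (2 * x * y, x * x, math.cos(z))

def directional_derivative(point, v, h=1e-6):
    # Central-difference estimate of D_v f(point) for a unit vector v.
    forward = f(*(p + h * vi for p, vi in zip(point, v)))
    backward = f(*(p - h * vi for p, vi in zip(point, v)))
    return (forward - backward) / (2 * h)

point = (1.0, 2.0, 0.5)
length = math.sqrt(3.0)
v = (1 / length, 1 / length, 1 / length)  # unit vector

numeric = directional_derivative(point, v)
exact = sum(g * vi for g, vi in zip(grad_f(*point), v))  # grad f . v
print(numeric, exact)
```

The two printed numbers should agree to several decimal places, which is what $D_{\mathbf{v}}f(\mathbf{x})=\nabla f(\mathbf{x})\cdot\mathbf{v}$ predicts for a smooth function.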

Rearranging, this may be written as

$$ \frac{f(\mathbf{x}+h\mathbf{v})-f(\mathbf{x})-\nabla f(\mathbf{x})\cdot(h\mathbf{v})}{h}\to0 \quad \textrm{as }h\to0. $$

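With the substitution $\mathbf{h}=h\mathbf{v}$ (so that $\|\mathbf{h}\|=|h|$, since $\mathbf{v}$ is a unit vector), and requiring the limit to hold however $\mathbf{h}$ approaches $\mathbf{0}$ rather than only along one fixed direction, this becomes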

$$\frac{f(\mathbf{x}+\mathbf{h})-f(\mathbf{x})-\nabla f(\mathbf{x})\cdot\mathbf{h}}{\|\mathbf{h}\|}\to0 \quad \textrm{as }\|\mathbf{h}\|\to0. $$

The derivative of $f$ at $\mathbf{x}$ in this case is a vector $\nabla f(\mathbf{x})\in\mathbb{R}^n$ depending on $\mathbf{x}$.
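To see why the gradient is the right choice of vector in this limit, one can evaluate the error ratio numerically for shrinking $\mathbf{h}$. The sketch below is only illustrative: the function `f`, the base point, the direction, and the deliberately wrong vector `wrong_alpha` are all assumptions chosen for the example.

```python
import math

def f(x, y, z):
    # Hypothetical example function, chosen only for illustration.
    return x * x * y + math.sin(z)

def grad_f(x, y, z):
    # Gradient of the example function above, computed by hand.
    return (2 * x * y, x * x, math.cos(z))

def epsilon(point, h_vec, alpha):
    # (f(x + h) - f(x) - alpha . h) / ||h||
    shifted = [p + h for p, h in zip(point, h_vec)]
    linear_part = sum(a * h for a, h in zip(alpha, h_vec))
    norm_h = math.sqrt(sum(h * h for h in h_vec))
    return (f(*shifted) - f(*point) - linear_part) / norm_h

point = (1.0, 2.0, 0.5)
direction = (2 / 3, -1 / 3, 2 / 3)            # a unit vector
scales = [10.0 ** (-k) for k in range(1, 5)]  # ||h|| = 0.1, ..., 0.0001

gradient = grad_f(*point)
wrong_alpha = (gradient[0], gradient[1], 0.0)  # last component deliberately wrong

with_gradient = [epsilon(point, [t * d for d in direction], gradient) for t in scales]
with_wrong = [epsilon(point, [t * d for d in direction], wrong_alpha) for t in scales]

for t, good, bad in zip(scales, with_gradient, with_wrong):
    print(t, good, bad)
```

With the gradient, the ratio shrinks roughly in proportion to $\|\mathbf{h}\|$; with the wrong vector it settles near a nonzero constant, so the limit condition fails. This single direction is only an illustration, since the definition requires the ratio to vanish for every way $\mathbf{h}$ can approach $\mathbf{0}$.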

More generally, one can do the same thing for vector-valued functions $f:\mathbb{R}^n\to\mathbb{R}^m$, in which case a linear function $Df:\mathbb{R}^n\to\mathbb{R}^m$ is applied to $\mathbf{h}$ in place of the dot product with a vector. (This is a generalization, since any linear function $\mathbb{R}^n\to\mathbb{R}$ is just a dot product with some vector.)


As stated in my comment, $\alpha$ is a linear transformation that approximates the function $f$ well in a small region. I can only show you why the definition makes sense; I cannot tell you its physical meaning.

You can find the following arguments in Spivak's Calculus on Manifolds.

In the single-variable case, we define the derivative at a point $x$ by $$L:=\lim\limits_{h\to0}\frac{f(x+h)-f(x)}{h}$$ if such a number $L$ exists. The number $L$ is called the derivative of $f$ at $x$. Equivalently, we have $$\lim\limits_{h\to0}\left(\frac{f(x+h)-f(x)}{h}-L\right)=0,$$ or $$\lim\limits_{h\to0}\frac{f(x+h)-f(x)-Lh}{h}=0.$$

To generalise this expression to several variables, where $h$ is now a vector and we cannot divide by it, we put norm signs on both the numerator and the denominator (so that the division makes sense) and require $L$ to be a linear transformation. A function $f:\mathbb{R}^n\to\mathbb{R}^m$ is differentiable at $x$ if there exists a linear transformation $L:\mathbb{R}^n\to\mathbb{R}^m$ such that $$\lim\limits_{h\to0}\frac{\lVert f(x+h)-f(x)-Lh\rVert}{\lVert h\rVert}=0.$$ The linear transformation $L$ is then the derivative of $f$ at $x$.
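The several-variable definition can be checked numerically on a small example. The sketch below assumes a hypothetical map $f:\mathbb{R}^2\to\mathbb{R}^2$ and takes $L$ to be its Jacobian matrix, computed by hand; both are illustrative choices, not taken from Spivak.

```python
import math

def f(x, y):
    # Hypothetical example map from R^2 to R^2, chosen only for illustration.
    return (x * y, x + y * y)

def jacobian(x, y):
    # Matrix of partial derivatives of the example map, computed by hand.
    return ((y, x), (1.0, 2 * y))

def norm(vec):
    return math.sqrt(sum(c * c for c in vec))

def error_ratio(point, h_vec):
    # ||f(x + h) - f(x) - L h|| / ||h||, with L the Jacobian at x.
    L = jacobian(*point)
    fx = f(*point)
    fxh = f(point[0] + h_vec[0], point[1] + h_vec[1])
    Lh = tuple(sum(row[j] * h_vec[j] for j in range(2)) for row in L)
    remainder = tuple(a - b - c for a, b, c in zip(fxh, fx, Lh))
    return norm(remainder) / norm(h_vec)

point = (1.0, 2.0)
direction = (0.6, 0.8)                        # a unit vector
scales = [10.0 ** (-k) for k in range(1, 5)]  # ||h|| = 0.1, ..., 0.0001

ratios = [error_ratio(point, (t * direction[0], t * direction[1])) for t in scales]
for t, r in zip(scales, ratios):
    print(t, r)
```

For this example the remainder is purely quadratic in $h$, so the printed ratio shrinks in proportion to $\lVert h\rVert$, consistent with the limit being $0$.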