How to rigorously define the parametric derivative?

Question


410 Views Asked by Bumbble Comm At 07 Apr 2026 - 9:38

I'm trying to understand the parametric derivative identity $$ \frac{dy(t)}{dt} = \frac{dy}{dx} \frac{dx}{dt} \tag{1}$$

I feel this is not rigorous, because we are saying that $y(t)$ can be written as $y(x(t))$. What conditions are required on $y$ for this to be true?


Bumbble Comm On 13 Mar 2022 - 2:56

The definition of the derivative as the limit for $Δx$ tending to $0$ reads:

$\lim_{ Δx \to 0} \frac{ Δy }{ Δx }=\frac{dy}{dx}$,

It can also be written as:

$ \frac{ Δy }{ Δx }=\frac{dy}{dx}+\epsilon$,

where

$\lim_{ Δx \to 0}\epsilon=0$.

Multiplying both sides by $Δx$ gives

$ Δy= \frac{dy}{dx} Δx +\epsilon Δx $,

and, treating $x$ as a function of $t$, we similarly have

$ Δx= \frac{dx}{dt} Δt +\epsilon' Δt $,

therefore

$ Δy= \frac{dy}{dx}\Big(\frac{dx}{dt} Δt +\epsilon' Δt\Big)+\epsilon \Big(\frac{dx}{dt} Δt +\epsilon' Δt\Big)$,

$ Δy= \frac{dy}{dx}\frac{dx}{dt} Δt +\epsilon'\frac{dy}{dx} Δt+\epsilon \frac{dx}{dt} Δt+\epsilon \epsilon' Δt$,

$\frac{ Δy }{ Δt }=\frac{dy}{dx}\frac{dx}{dt} +\epsilon'\frac{dy}{dx}+\epsilon\frac{dx}{dt}+\epsilon\epsilon'$;

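taking the limit for $Δt$ tending to $0$, $Δx$ also tends to $0$ (because $x$ is differentiable, hence continuous, in $t$), so both $\epsilon$ and $\epsilon'$ tend to $0$ and the last three terms vanish. Here we set $\epsilon = 0$ whenever $Δx = 0$, so that the expression for $Δy$ remains valid in that case. We get:

$\lim_{ Δt\to 0}\frac{ Δy }{ Δt }=\frac{dy}{dx}\frac{dx}{dt}$.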
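As a numerical sanity check of this limit, here is a minimal Python sketch. The functions $x(t)=\sin t$ and $f(x)=x^2$ and the point $t_0 = 0.7$ are arbitrary choices made for illustration; they are not part of the argument above.

```python
import math

# Illustrative choices (assumptions, not from the answer): x(t) = sin t, f(x) = x^2.
def x(t):
    return math.sin(t)

def f(x_val):
    return x_val ** 2

def y(t):
    # y depends on t only through x(t)
    return f(x(t))

t0 = 0.7

# Right-hand side of the identity: dy/dx evaluated at x(t0), times dx/dt at t0.
exact = 2 * x(t0) * math.cos(t0)

def difference_quotient(dt):
    # Delta y / Delta t for a finite step dt
    return (y(t0 + dt) - y(t0)) / dt

# The gap between the quotient and the product should shrink as dt -> 0.
errors = [abs(difference_quotient(10.0 ** -k) - exact) for k in range(1, 6)]
for k, err in zip(range(1, 6), errors):
    print(f"dt = 1e-{k}: |quotient - product| = {err:.3e}")
```

The printed gaps should shrink roughly in proportion to the step, which is what the vanishing $\epsilon$ terms predict.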

**Bumbble Comm** · Accepted Answer

For me, getting a better handle on derivatives only happened after looking into the more general notions related to differentiation, where the meaning of $d$ becomes clearer. Keywords to consider are "total derivative", "Fréchet derivative", and "differential of a function". I find that there is some overlap between all these and the naming is somewhat fluid, but in general what you get is a definition of the differential as a linear approximation of the function around a point, which is really just a generalization of the (one-dimensional) derivative on $\mathbb R$.

For a given function $f:X\to Y$ ($X$ and $Y$ can be arbitrary normed vector spaces at their most general, but the same also works for $\mathbb R^n$ and $\mathbb R$), its differential at a point $x\in X$ is defined as the (unique when it exists) bounded linear operator $L$ (think matrix multiplication for finite-dimensional vector spaces and multiplication by a constant for $\mathbb R$) such that the value of $f$ around $x$ can be approximated as $$ f(x + h) = f(x) + L(h) + \epsilon(h) $$ and the error $\epsilon(h)$ goes to zero faster than the norm of $h$, which translates to the limit condition $$ \lim_{\|h\|_X\to 0} \frac{\|\epsilon(h)\|_Y}{\|h\|_X}=0 $$ which turns into the regular definition of the derivative for functions on $\mathbb R$. When this linear operator $L$ exists, we say that $f$ is differentiable at $x$ and we write its value as $df(x)$. A key observation here is that $df(x)$ in general is not a number, but a function.
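To make the limit condition concrete, the following Python sketch evaluates $\|\epsilon(h)\|/\|h\|$ for shrinking $h$. The function $f(x_1, x_2) = x_1^2 x_2$, the base point, and the direction of $h$ are all illustrative assumptions, and the candidate operator $L$ is the usual gradient dot product. Checking a single direction is only a plausibility test, not a proof of differentiability.

```python
import math

# Illustrative function f: R^2 -> R (an assumption for this example).
def f(p):
    x1, x2 = p
    return x1 ** 2 * x2

def L(p, h):
    # Candidate differential of f at p: the linear map h -> grad f(p) . h
    x1, x2 = p
    return 2 * x1 * x2 * h[0] + x1 ** 2 * h[1]

p = (1.5, -2.0)          # base point (arbitrary)
direction = (0.6, 0.8)   # unit vector along which h shrinks (arbitrary)

ratios = []
for k in range(1, 6):
    s = 10.0 ** -k
    h = (s * direction[0], s * direction[1])
    # epsilon(h) = f(p + h) - f(p) - L(h)
    eps = f((p[0] + h[0], p[1] + h[1])) - f(p) - L(p, h)
    ratios.append(abs(eps) / math.hypot(*h))

for k, r in zip(range(1, 6), ratios):
    print(f"|h| = 1e-{k}: |eps(h)| / |h| = {r:.3e}")
```

The ratio shrinking along with $\|h\|$ is what "the error goes to zero faster than the norm of $h$" means in practice.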

The main property that demystifies $d$ (at least for me) is the chain rule for the differential, which is the following: $$ d(f \circ g)(x) = df(g(x)) \circ dg(x) $$
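A small Python sketch can illustrate this chain rule by treating each differential as a linear map and composing the maps. The functions $g(t) = (\cos t, \sin t)$ and $f(x_1, x_2) = x_1^2 x_2$, and the helper `compose`, are hypothetical choices for this example only.

```python
import math

# Illustrative inner function g: R -> R^2 and its differential at t.
def g(t):
    return (math.cos(t), math.sin(t))

def dg(t):
    # dg(t) is a linear map R -> R^2, returned as a Python function of h
    return lambda h: (-math.sin(t) * h, math.cos(t) * h)

# Illustrative outer function f: R^2 -> R and its differential at p.
def f(p):
    return p[0] ** 2 * p[1]

def df(p):
    # df(p) is a linear map R^2 -> R
    return lambda k: 2 * p[0] * p[1] * k[0] + p[0] ** 2 * k[1]

def compose(outer, inner):
    # Composition of two linear maps
    return lambda h: outer(inner(h))

t0 = 0.4

# Chain rule: d(f o g)(t0) = df(g(t0)) o dg(t0)
d_fg = compose(df(g(t0)), dg(t0))
chain = d_fg(1.0)

# Independent estimate of (f o g)'(t0) by a central difference.
step = 1e-6
numeric = (f(g(t0 + step)) - f(g(t0 - step))) / (2 * step)

print(chain, numeric)
```

Because `d_fg` is linear in `h`, evaluating it at `h = 1.0` recovers the ordinary derivative of $f \circ g$ at `t0`, which is what the central difference approximates.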

Now for the actual answer to your question, consider first a function $x$ of time. When differentiable, the value of the differential $dx(t)$ will be a linear function of $h$, which for $\mathbb R$ means it's of the form $k \cdot h$. We know the value of $k$ depends on $t$ also, so we'll denote it with $x'(t)$ (recognize here the derivative). So we have $dx(t)(h) = x'(t) \cdot h$.

So we have $dx$ already but we are missing $dt$. We can consider that $dt$ stands for the differential of the identity function on time, so $dt(t)(h) = 1 \cdot h$, which we can substitute into the previous relation to get $dx(t)(h) = x'(t) \cdot dt(t)(h)$. From here we can (with some abuse of notation) obtain the derivative as $x'(t) = \frac{dx}{dt}(t)$. Using this we can rewrite the differential as: $$ dx(t)(h) = \frac{dx}{dt}(t) \cdot h $$

For the last step, we add $y$ as a function of time. We also know that $y$ only depends on time via $x$, so in fact there exists a different function $f$ of $x$ such that $y = f \circ x$. This function $f$ is usually implicit and in my opinion can cause a lot of confusion. We apply the chain rule for $y$: $$ dy(t) = d(f \circ x)(t) = df(x(t)) \circ dx(t) \Rightarrow \frac{dy}{dt}(t) \cdot h = \frac{df}{dx}(x(t)) \cdot \frac{dx}{dt}(t) \cdot h $$ Next, we ignore the distinction between $f$ and $y$, omit where the derivatives are computed and obtain $$ \frac{dy(t)}{dt} = \frac{dy}{dx} \cdot \frac{dx}{dt} $$ but for me a more appropriate simplification would be $$ \frac{dy}{dt} = \left(\frac{df}{dx} \circ x\right) \cdot \frac{dx}{dt} $$
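The assumption that $y$ depends on time only via $x$ can be probed on sample points: a function $f$ with $y = f \circ x$ can exist only if equal values of $x$ always come with equal values of $y$. The sketch below tests this on the circle $x = \cos t$, $y = \sin t$, which is an illustrative choice; the helper `factors_through_x` and its tolerance are hypothetical. Passing the check on a finite sample is only a necessary condition, and differentiability of $f$ is a separate requirement (for instance, if $x$ is continuously differentiable with $x'(t_0) \neq 0$, then $x$ is locally invertible and $f = y \circ x^{-1}$ works near $x(t_0)$).

```python
import math

def factors_through_x(x, y, ts, tol=1e-9):
    """Return False if two sample times give (nearly) equal x but different y.

    This is a finite-sample check of a necessary condition for y = f(x(t)).
    """
    xs = [x(t) for t in ts]
    ys = [y(t) for t in ts]
    for i in range(len(ts)):
        for j in range(i + 1, len(ts)):
            if abs(xs[i] - xs[j]) < tol and abs(ys[i] - ys[j]) > tol:
                return False
    return True

# Upper half of the unit circle: x = cos t is one-to-one on [0, pi].
upper_half = [k * math.pi / 20 for k in range(21)]
# Full circle: t and -t share the same x but have opposite y.
full_circle = [-math.pi + k * math.pi / 20 for k in range(41)]

on_half = factors_through_x(math.cos, math.sin, upper_half)
on_full = factors_through_x(math.cos, math.sin, full_circle)

print("upper half:", on_half)
print("full circle:", on_full)
```

On the upper half the curve is the graph of $f(x) = \sqrt{1 - x^2}$, so the check passes; on the full circle no single $f$ exists, and the identity can only be used locally.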
