If instantaneous rates of change aren't that rigorous, how correct is the usage of instantaneous rates of change (like velocity) by physicists?


According to this answer, instantaneous rates of change are more intuitive than they are rigorous.

I tend to agree with that answer because, in the Wikipedia article on differential calculus, they aren't defining the derivative to be the slope at a particular point. They define it as, "The derivative of a function at a chosen input value describes the rate of change of the function near that input value." Although this isn't wrong, the definition has been written rather safely, and I think that was intentional. They didn't define it as the slope of the graph at a particular point. It is only in the explanations section of the Derivative wiki article that they did that: "The derivative of a function y = f(x) of a variable x is a measure of the rate at which the value y of the function changes with respect to the change of the variable x. It is called the derivative of f with respect to x. If x and y are real numbers, and if the graph of f is plotted against x, derivative is the slope of this graph at each point."

So, are physicists using terms like "instantaneous velocity" merely from an intuitive standpoint? What is the physical significance of instantaneous rates of change?

5

There are 5 best solutions below

0
On BEST ANSWER

It's perfectly fine to use intuition in applying mathematics - it's just that in mathematics itself we want rigorous definitions so we can actually prove stuff. We seek definitions that formalize our intuition about something.

Just consider the simple example of finding the derivative of $f(x) = x^2$. Using the "intuitive definition" it's not really clear that this should equal $2x$. You could of course look at a few examples and extrapolate from those that it should be true, but how can you really be sure? In contrast the "hard" definition (which can also be considered to be a rather intuitive one) directly allows you to construct the derivative.
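
To make that last point concrete, here is a short worked computation (with $h$ denoting the increment, a symbol chosen here for illustration) showing how the limit definition produces the derivative of $f(x) = x^2$ directly:

$$\frac{f(x+h) - f(x)}{h} = \frac{(x+h)^2 - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h \quad (h \neq 0),$$

so letting $h \to 0$ gives $f'(x) = 2x$, with no extrapolation from examples needed.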

The approach mathematicians often take is to pick some concept, state which properties it should have, try formalizing those, and then check whether the resulting thing:

  • already "nails down" a concept well enough or if it's still too general
  • conforms to our intuition.

So we defined the mathematical concept of derivative the way we did, because it corresponds to our intuitive notion of rate of change and thus should be applicable in circumstances where the intuitive thing is asked for.

When applying math in physics, engineering etc. you of course always have to consider whether it makes sense to model some real life phenomenon via the mathematical idealized version: What assumptions go into a derivative? Are they compatible with the real world? Surely some notion of continuity is needed for a derivative. Is the real world continuous? We really don't know, and afaik (not a physicist) we can never find out. That's why physics is more than just building theories - we also need to do experiments to see if our theory corresponds with the real world up to some acceptable margin of error. And judging from experiments and how successful we are in modelling the real world using differential calculus, it would seem that using the intuition behind derivatives in the real world isn't totally wrong.

8
On
  1. I disagree with the Answer cited in your first line: ‘instantaneous rate of change’ does have a formal meaning. Just because a concept is defined with reference to the details of its neighbourhood does not mean that the idea/concept is fuzzy or informal.

    Instantaneous rate of change $(R)$ at $q$ is coarsely analogous to the country of birth $(N)$ of $p:$ sure, the neighbourhoods of $q$ and $p$ exist independently of $q$ and $p,$ but this does not detract from the fact that the values of $R$ and $N$ at $q$ and $p,$ respectively, depend—in fact, are defined based—on the respective neighbourhoods of $q$ and $p.$ The definitions of $R$ and $N$ are rigorous, precise and unambiguous.

  2. You agree that for a car moving at a constant $70\mathrm{km/h}$ velocity, its instantaneous velocity at $t=100\mathrm s$ rigorously and precisely equals $70\mathrm{km/h}.$

    Presumably, you also agree that even if the car was travelling at a varying velocity, it continues to have some instantaneous velocity at $t=100\mathrm s.$

    If so, then your question becomes: is the formalisation of ‘instantaneous velocity’ (rate of change, derivative) somewhat arbitrary in capturing the actual instantaneous velocity? I'd say it really does “accurately” and (at the risk of sounding circular) definitively reflect the actuality.

4
On

It is not that they aren't rigorous, it's that calculus books, as usual, don't necessarily take the care to make the relevant distinctions to make it fully rigorous. They can be made rigorous.

One thing I'd argue is that the "instantaneous rate of change" is something which can be defined as formally equivalent to, but conceptually distinct from, the derivative, with the derivative being more general. A derivative, in the case of a function of a real variable, is a certain quantity that characterizes the local behavior of such a function around an input point and how it responds to small changes in that input or, better, how its output differs when considering input values slightly different from a particular input and compared against the value it attains at that particular input.

The reason I say this is because the concept of a "rate of change" implicitly presumes a flow of time, and not all derivatives involve time.

The derivative of $f$ at $a$ is defined by

$$f'(a) := \lim_{\Delta x \rightarrow 0} \frac{f(a + \Delta x) - f(a)}{\Delta x}$$

as you already know.

But now for the instantaneous rate of change. Parsing that term, we'd ideally want to say that, to make the intuition in it rigorous, we should define both what a "rate of change" is and, moreover, what it means for that rate to be "instantaneous".

So how do we do that? Analyzing the term further, we see we need to define "change" and "rate". The change - before we get to "rate of" - of a temporally-varying quantity from time $t_1$ to time $t_2$, given as a function $f$ of time, is thus defined by

$$\text{Change in $f(t)$ from time $t_1$ to time $t_2$} := f(t_2) - f(t_1)$$

i.e. change is just subtraction (difference). The rate of change then, is the ratio of two changes (note that the "change in time" can be understood as the change of the identity function of time, so we don't need another definition):

$$\text{Rate of change of $f(t)$ from time $t_1$ to time $t_2$} := \frac{\text{Change in $f(t)$ from time $t_1$ to time $t_2$}}{\text{Change in time from $t_1$ to $t_2$}}$$

from which we can see that

$$\text{Rate of change of $f(t)$ from time $t_1$ to time $t_2$} = \frac{f(t_2) - f(t_1)}{t_2 - t_1}$$

So what then is the instantaneous rate of change? Logically, it is the rate of change at a single instant, i.e. when $t_1 = t_2 = t_a$ at a particular instant $t_a$. However, we cannot achieve that with the above definition because we get a division-by-0 error. Instead, what we must do is use a limit to fill it in - in particular, we should take the following two-dimensional limit:

$$\text{Instantaneous rate of change of $f$ at $t_a$} := \lim_{(t_1, t_2) \rightarrow (t_a, t_a)} \frac{f(t_2) - f(t_1)}{t_2 - t_1}$$

where we only consider points $t_1$ and $t_2$ such that both $t_1 \le t_a \le t_2$, i.e. the intervals of change "bracket" our desired point $t_a$, and $t_1 \ne t_2$. Then we have

Theorem: If the IRoC of $f$ exists at $a$, then it equals $f'(a)$.
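
As a rough numerical illustration of the theorem (not a proof), the sketch below evaluates the bracketing quotient on shrinking intervals with $t_1 \le a \le t_2$. The function name `bracketed_rate`, the sample function $f(t) = t^3$, the point $a = 2$ and the asymmetric intervals $[a - 2h, a + h]$ are all assumptions made for this example; since $f'(2) = 12$, the quotients should approach 12 as $h$ shrinks.

```python
def bracketed_rate(f, t1, t2):
    """Rate of change of f from t1 to t2 (requires t1 != t2)."""
    if t1 == t2:
        raise ValueError("t1 and t2 must differ")
    return (f(t2) - f(t1)) / (t2 - t1)


def f(t):
    # Hypothetical sample function; its derivative is 3 * t**2.
    return t ** 3


a = 2.0
exact = 3 * a ** 2  # f'(2) = 12

# Shrinking intervals [a - 2h, a + h] that bracket a asymmetrically.
rates = [bracketed_rate(f, a - 2 * h, a + h) for h in (1e-1, 1e-2, 1e-3, 1e-4)]
errors = [abs(r - exact) for r in rates]

for r, e in zip(rates, errors):
    print(f"rate = {r:.6f}, error = {e:.2e}")
```

The intervals are deliberately not centred on $a$: the limit above ranges over every bracketing pair $(t_1, t_2)$, not just symmetric ones.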

0
On

In Mathematics, definitions and systems are not written once and baked into clay.

Instead, they develop over time. For derivatives, you can go back to Newton and Leibniz, who developed ways to describe the slopes of equations using different syntax and different definitions. By today's standards, neither definition was "formal".

Over time, the syntax we use to talk about derivatives, the terms we use, and the formal definitions have evolved.

The most common formal definition we use for derivatives of one-dimensional functions is the epsilon-delta based one. There are others which can be shown to give the same results on the set of common functions for which we collectively and intuitively agree on what "slope" means.

On edge cases, two different ways of talking about derivative may describe things differently; those edge cases tend to be pretty weird.

And then you go up a level and start talking about abstractions of the derivative. What happens when, instead of functions from $\mathbb{R}$ to $\mathbb{R}$, we are talking about $\mathbb{Z}$ to $\mathbb{Z}$ or complex numbers or quaternions or polynomials or vectors or more exotic group structures or Lie algebras?

In those contexts, you'll find derivative-like structures that map back to important features of the derivative, be they "this is like a slope" or "it has similar algebraic properties as the differential operator".

But what makes it the derivative is that when you have a reasonably well behaved drawing of a line that doesn't fold over, the value it assigns to each point along the line is the slope of the line you drew. That is the core concept that unifies the derivative of Newton to modern epsilon-deltas or infinitesimal based derivative definitions.

Working with that directly is like cooking food when your definition of food is "stuff critters eat". Any technical definition of food has to be consistent with that. Any gastronomic definition of food has to be consistent with that. But that isn't a useful working definition for almost any practical purpose.

If tomorrow someone came up with and shared a better definition of derivative that encompassed the idea of instantaneous rate of change but turned out to be much more useful than existing ones, over a period of decades I'd expect the current epsilon-delta based definition to fall in prominence. The new definition would still be "the derivative" (or would eventually be it), so saying that the epsilon-delta definition is what the derivative "is" is misleading over the medium or long term.

0
On

I think you failed to realize that "slope of $f$ at $x$" is meaningless until you define it, but how can you define it precisely? That is the real problem.

You cannot just say "tangent", either, because that just swaps the problem for a worse problem. Not only is it difficult to define "tangent", it fails on cases like $f : ℝ→ℝ$ defined by $f(0) = 0$ and $f(x) = x^2·\cos(1/x)$ for every $x∈ℝ_{≠0}$, if you try to define "tangent" in terms of a line that touches only once in a small open disk around the point.
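
A quick check of this example, using the usual difference quotient (added here as an aside): for $h \neq 0$,

$$\left|\frac{f(h) - f(0)}{h}\right| = \left|h\cos(1/h)\right| \le |h| \to 0 \quad \text{as } h \to 0,$$

so $f'(0) = 0$ and the natural candidate for the tangent at $0$ is the line $y = 0$. Yet $f(x_k) = 0$ at $x_k = \frac{1}{(k + 1/2)\pi}$ for every integer $k \ge 0$, and $x_k \to 0$, so that line meets the graph infinitely often inside every disk around the origin.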

Ultimately, one of the best ways to define "slope/gradient of $f$ at $x$" is via the standard definition. You might ask "Why that definition?", and the answer is:

Because that definition gives nice properties, and even corresponds with intuition!

See, suppose you have a smooth hillside, and you happen to know a function $f$ that captures its elevation for every coordinate along a straight path up that hill. Then the rigorous definition of $f'$ actually gives you something that coincides with the intuitive notion of slope! The reason is that slope is intuitively "how fast it goes up", and you cannot measure it at a single point but you can measure it near a point. If you zoom in on that point on the hillside and it gradually looks more and more straight as you zoom in, then it is (rigorously provable to be) differentiable there and indeed $f'$ at that point is exactly the slope of the line that you approach as you zoom in.

Similarly, the instantaneous speed of a vehicle cannot really be measured in real life; what the speedometer measures is how many revolutions the wheel makes over a small time interval. In fact, you should notice that if you jerk the accelerator pedal on and off rapidly, the speedometer actually fails to show you the rapidly changing speed, precisely because it does not measure instantaneous speed and its time resolution is not that good. But if you travel on a relatively smooth journey, then the mathematically rigorous derivative matches closely with what the speedometer shows, because each small part of the speed-time graph is close to linear.
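
The speedometer remark can be mimicked with a crude toy model, offered only as a sketch: treat the reading as the average speed over a short window and compare it with the exact derivative. The window length, both position functions and all names below are hypothetical choices for illustration and say nothing about how a real speedometer works.

```python
import math

WINDOW = 0.5  # hypothetical averaging window of the speedometer, in seconds


def reading(position, t, window=WINDOW):
    """Average speed over [t - window, t], a crude model of a speedometer."""
    return (position(t) - position(t - window)) / window


# Smooth journey: speed 20 + 0.2 t (m/s), so position is 20 t + 0.1 t^2.
def smooth_position(t):
    return 20.0 * t + 0.1 * t ** 2


def smooth_speed(t):
    return 20.0 + 0.2 * t


# Jerky journey: speed 20 + 5 sin(2 pi * 10 t), oscillating ten times per second.
OMEGA = 2.0 * math.pi * 10.0


def jerky_position(t):
    return 20.0 * t - (5.0 / OMEGA) * math.cos(OMEGA * t)


def jerky_speed(t):
    return 20.0 + 5.0 * math.sin(OMEGA * t)


t = 3.025  # an instant at which the jerky speed is at a peak
smooth_gap = abs(reading(smooth_position, t) - smooth_speed(t))
jerky_gap = abs(reading(jerky_position, t) - jerky_speed(t))
print(f"smooth: gap = {smooth_gap:.3f} m/s")
print(f"jerky:  gap = {jerky_gap:.3f} m/s")
```

In this model the gap between reading and true speed stays small on the smooth journey, where the speed-time graph is linear, and is large on the jerky one, where the window spans several full oscillations and averages them away.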