Failing to understand some basic idea behind differentiation


I just discovered I must have some big holes in my knowledge of basic calculus, and this is scary honestly.

I have to compute some derivatives of the solution of a dynamical system: \begin{equation*} \frac{\text d y(t)}{\text dt} = f(t,y(t)),\quad y(t_0) = y_0,\quad t_0\leq t\leq T. \end{equation*}

Say that I have to compute derivatives with respect to $t$. Clearly, $\dfrac{\text d y(t)}{\text dt}$ is given.

  1. I want to compute $\dfrac{\text d y(t)}{\text du}$ with $u<t$. I write: \begin{equation*} y(t) = y(u) +\int_u^t f(\tau,y(\tau))\text d \tau\therefore\dfrac{\text d y(t)}{\text du}=\dfrac{\text d y(u)}{\text du}+\dfrac{\text d}{\text du}\int_u^t f(\tau,y(\tau))\text d \tau=f(u,y(u))+? \end{equation*} The question mark stands for my uncertainty about how to compute the derivative of the integral by the Leibniz rule. I will not report all my doubts here; I could fill pages.

  2. I assume $\dfrac{\text d y(t)}{\text du}=0$ with $u>t$ for physical reasons (how can the future influence the past?), but is it actually true? If I unwind the whole computation, the chain rule produces terms like $\dfrac{\text d t}{\text du}$. Intuitively, this should be zero, but since $\dfrac{\text d t}{\text du}=\left(\dfrac{\text d u}{\text dt}\right)^{-1}$, I would instead set it to 1.

  3. the specific case $\dfrac{\text d y(t)}{\text d t_0}$ is the funniest. I get different results when computing it as \begin{equation*} \dfrac{\text d y(t)}{\text d t_0} = \dfrac{\text d}{\text d t_0}\left(y_0+\int_{t_0}^t f(\tau,y(\tau))\text d\tau\right) = -f(t_0,y_0) \end{equation*} or \begin{equation*} \dfrac{\text d y(t)}{\text d t_0} = \dfrac{\text d y(t)}{\text d t}\dfrac{\text d t}{\text d t_0} = f(t,y(t)) \end{equation*}

  4. what about $\dfrac{\text d y(u)}{\text d y(t)}$ with $t<u$, by Leibniz rule? I obtain different results writing $y(u)=y(t)+\int_t^u f(\tau,y(\tau))\text d\tau$ or $y(u)=y_0+\int_0^u f(\tau,y(\tau))\text d\tau$

  5. $\dfrac{\text d y(u)}{\text d y(t)}$ with $t>u$ would be 0 for physical reasons, or the inverse of what results in point 4, by algebra of differentials.

How would you resolve these doubts? I think I don't completely understand the meaning of the derivative...

On BEST ANSWER

Premise. As already pointed out in the comments, I believe your confusion stems from a somewhat excessive use of the Leibniz notation for the derivative, rather than from a poor understanding of the derivative itself. While the notation $\rm{d}y/\rm{d}t$ is undeniably intuitive and 'agile', for some things it is not the most precise notation possible. In particular, it makes a complete mess of the point at which the derivative is being evaluated, making it impossible to distinguish between the independent variable and some fixed point. I suggest we ditch it completely, and instead write $y'(t)$ for the derivative of $y$ at the point $t$. With this in mind, let me address your points one by one.

1) If $u,t\in[t_0,T]$ and $u<t$, it's certainly true that $$y(t)=y(u)+\int_u^tf(\tau,y(\tau))\,\rm{d}\tau,$$ but it's not at all clear what $\rm{d}y(t)/\rm{d}u$ even means. If you mean the derivative of $y$ evaluated at the point $u$, i.e. $y'(u)$, then this is just $f(u,y(u))$. Otherwise, you could say $t$ is now fixed and $u$ is your independent variable (this effectively reduces your interval to $[t_0,t]$). Now that the meaning of all symbols has been clarified, you can take a derivative of both sides of your equation: the LHS vanishes because $y(t)$ is now a constant, whereas the RHS reads $$y'(u) - f(u,y(u)) = y'(u) - y'(u) = 0,$$ i.e. you get the trivial identity $0=0$, as you should.
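For the record, the Leibniz-rule term hidden behind the question mark in point 1 is fixed by the fundamental theorem of calculus: differentiating with respect to the lower limit $u$ (with $t$ held fixed) gives $$\frac{\mathrm{d}}{\mathrm{d}u}\int_u^t f(\tau,y(\tau))\,\mathrm{d}\tau = -f(u,y(u)),$$ since moving the lower limit forward removes area from the integral. Substituting this back into the differentiated equation reproduces exactly the cancellation $y'(u)-f(u,y(u))=0$.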

2) Your writing in this point corroborates the idea that to you $\rm{d}y(t)/\rm{d}u$ must stand for $y'(u)$. Let me point out, however, that you have absolutely no reason to assume $y'(u)$ vanishes for $u$ greater than some fixed $t\in[t_0,T]$, unless the function $f$ dictates so. For instance, if you take $f(t,y(t))=A$ where $A$ is a non-zero constant, this is false, yet the equation $y'(t)=f(t,y(t))$ makes perfect sense regardless (and it's easily solved). The equation you solve at the end of your post is another example. This is because, while we use differential equations to describe the time evolution of physical systems, they don't have to describe one. Thus if you want a system to be causal, that is a condition you have to impose on the system; otherwise it may well be false.
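As a quick numerical sanity check of this point, here is a sketch with invented data ($f\equiv A=2$, $y_0=1$, $t_0=0$, none of which come from the original post): the derivative $y'(u)$ equals $2$ at every point, including points "in the future" of any fixed $t$.

```python
# Invented example: f(t, y) = A with A = 2, y0 = 1, t0 = 0.
# The exact solution is y(t) = y0 + A*(t - t0), so y'(u) = A for *every* u:
# it does not vanish for u greater than any fixed t, although the ODE is
# perfectly sensible.

A, Y0, T0 = 2.0, 1.0, 0.0

def y_exact(t):
    return Y0 + A * (t - T0)

def y_prime_numeric(u, h=1e-6):
    # central finite-difference approximation of y'(u)
    return (y_exact(u + h) - y_exact(u - h)) / (2.0 * h)

t_fixed = 1.0
for u in (0.5, 1.5, 3.0):  # points both before and after t_fixed
    print(u, y_prime_numeric(u))  # approximately 2.0 at every u, also for u > t_fixed
```

Causality would have to be built into $f$ itself; nothing in the ODE formalism supplies it for free.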

3) Here is where you can see clearly how the Leibniz notation falls short and, at the same time, how useful it is. The inverse function is defined by the identity $y(t)=\eta \Longleftrightarrow y^{-1}(\eta)=t$; under some very reasonable circumstances it then holds that $$(y^{-1})'(\eta) = \frac{1}{y'(y^{-1}(\eta))} = \frac{1}{y'(t)}.$$ Notice that the LHS is evaluated at a different point than the RHS, but if we forget about this and use the Leibniz notation (which does not specify the evaluation point anyway), the identity above reads $$\frac{\rm{d}y^{-1}(\eta)}{\rm{d}\eta} = \frac{\rm{d}t}{\rm{d}\eta} = \frac{1}{\rm{d}y/\rm{d}t}.$$ Now we may formally write this as $$\frac{\rm{d}t}{\rm{d} y}=\left(\frac{\rm{d} y}{\rm{d}t}\right)^{-1},$$ so long as we understand this is just a notational shorthand and not a full-fledged identity. You should now be able to see why the correct identity is $$y'(t_0) = f(t_0,y_0),$$ and not $$y'(t_0) = -f(t_0,y_0).$$
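A concrete numerical check of the inverse-function identity $(y^{-1})'(\eta) = 1/y'(y^{-1}(\eta))$, using an invented example (any smooth, strictly monotone $y$ would do): $y(t)=e^t$, so $y^{-1}(\eta)=\ln\eta$ and both sides should equal $1/\eta$.

```python
import math

# Check (y^{-1})'(eta) = 1 / y'(y^{-1}(eta)) for y(t) = e^t, y^{-1}(eta) = ln(eta).
def y(t):
    return math.exp(t)

def y_inv(eta):
    return math.log(eta)

def deriv(g, x, h=1e-6):
    # central finite-difference approximation of g'(x)
    return (g(x + h) - g(x - h)) / (2.0 * h)

eta = 3.0
lhs = deriv(y_inv, eta)           # (y^{-1})'(eta), evaluated at eta
rhs = 1.0 / deriv(y, y_inv(eta))  # 1 / y'(t), evaluated at t = y^{-1}(eta)
print(lhs, rhs)  # both approximately 1/3
```

Note how the two sides are evaluated at different points ($\eta$ versus $t=y^{-1}(\eta)$), which is exactly the distinction the Leibniz notation hides.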

4,5) I believe my previous answers address these points as well. In particular, there is no "algebra of differentials", only formal differential identities which mimic specific theorems, and as such they should be taken with a grain of salt.