From a textbook on nonlinear systems [1], a theorem on the continuous dependence of ODE solutions on initial data:
My questions:
- What is the relationship between chaos (the so-called sensitive dependence on initial conditions) and continuous dependence on initial conditions?
- So for the Lorenz system, one cannot apply the theorem above simply because the corresponding $\mathbf{f}$ is not locally Lipschitz in the vector $\mathbf{x}$?
Ref: [1] Khalil, Hassan K., and Jessy W. Grizzle. Nonlinear Systems. Vol. 3. Upper Saddle River, NJ: Prentice Hall, 2002.

The right-hand side of the Lorenz system is polynomial, so of course it is locally Lipschitz. If you look at a typical picture of a path following the attractor, the corresponding 3D bounding box read off from the coordinate axes is $[-20,20]\times[-25,25]\times[0,50]$. The Jacobian of $$f(x,y,z)=(\sigma (y-x), x (\rho-z)-y, x y-\beta z),$$ with $\sigma=10$, $\beta=8/3$, $\rho=28$, is $$ J(x,y,z)=\begin{bmatrix} -\sigma&\sigma&0\\\rho-z&-1&-x\\y&x&-\beta \end{bmatrix}. $$ Any bound on a norm of $J$ over this box is a Lipschitz constant there, for instance the bound on the row sum norm, $\max(20,49,47+\frac23)=49$. Taking $L=50$ for simplicity, we get a worst-case error magnification factor of $e^{50}=5.1847\cdot 10^{21}$ per unit of time. This means that errors at the level of the double-precision machine epsilon (about $2.2\cdot 10^{-16}$) can be magnified beyond any sensible margin in a time less than $\Delta t=1$.
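As a quick check of these numbers, here is a minimal Python sketch. The helper names (`jacobian`, `row_sum_norm`, `BOX`) are chosen for illustration only, and the bounding box is the one read off the plot above, not a proven invariant set. Each row sum of $|J|$ is a convex function of $(x,y,z)$, so its maximum over the box is attained at a corner, and it suffices to evaluate the eight corners.

```python
import math
from itertools import product

SIGMA, BETA, RHO = 10.0, 8.0 / 3.0, 28.0

def jacobian(x, y, z):
    """Jacobian of the Lorenz right-hand side at (x, y, z)."""
    return [
        [-SIGMA, SIGMA, 0.0],
        [RHO - z, -1.0, -x],
        [y, x, -BETA],
    ]

def row_sum_norm(m):
    """Row sum (infinity) norm: largest sum of absolute values in a row."""
    return max(sum(abs(e) for e in row) for row in m)

# Assumed bounding box of the attractor, taken from the plot axes.
BOX = [(-20.0, 20.0), (-25.0, 25.0), (0.0, 50.0)]

# The row sums are convex in (x, y, z), so the corners give the maximum.
bound = max(row_sum_norm(jacobian(*corner)) for corner in product(*BOX))

L = 50.0                                   # rounded-up Lipschitz constant
growth_per_unit_time = math.exp(L)         # worst-case magnification factor
eps_machine = 2.0 ** -52                   # double-precision machine epsilon
time_to_order_one = math.log(1.0 / eps_machine) / L

print(bound, growth_per_unit_time, time_to_order_one)
```

The last quantity is the time after which the worst-case bound $e^{Lt}\cdot\text{eps}$ reaches size $1$; it comes out below one time unit, in line with the estimate above.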
Note that this is a worst-case estimate. In a real computation the growth factors are smaller and the local growth rate changes sign, so the error can also decrease over some segments. This is why it typically takes a time span of $\Delta t=10$ to $20$ until a minuscule difference in the initial conditions, or a difference in the numerical method, becomes visible in a plot.
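To see the gap between the worst-case bound and the observed behavior, one can integrate two nearby initial conditions and record when they separate. The sketch below is only an illustration under assumed settings: a hand-written classical RK4 step, step size $h=0.01$, starting point $(1,1,1)$, perturbation $10^{-8}$ in $x$, and "visible" defined as a max-norm distance of $1$. None of these choices come from the original discussion, and the observed time depends on all of them.

```python
import math

SIGMA, BETA, RHO = 10.0, 8.0 / 3.0, 28.0

def lorenz(s):
    """Right-hand side of the Lorenz system."""
    x, y, z = s
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)

def rk4_step(s, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(base, k, c):
        return tuple(b + c * ki for b, ki in zip(base, k))
    k1 = lorenz(s)
    k2 = lorenz(shifted(s, k1, h / 2))
    k3 = lorenz(shifted(s, k2, h / 2))
    k4 = lorenz(shifted(s, k3, h))
    return tuple(
        si + h / 6 * (a + 2 * b + 2 * c + d)
        for si, a, b, c, d in zip(s, k1, k2, k3, k4)
    )

def separation_time(s0, delta, threshold=1.0, h=0.01, t_max=60.0):
    """First time the two solutions differ by `threshold` in max norm.

    Returns None if that does not happen before t_max.
    """
    a = s0
    b = (s0[0] + delta, s0[1], s0[2])   # perturb only the x component
    for n in range(1, int(round(t_max / h)) + 1):
        a, b = rk4_step(a, h), rk4_step(b, h)
        if max(abs(p - q) for p, q in zip(a, b)) >= threshold:
            return n * h
    return None

delta = 1e-8
t_split = separation_time((1.0, 1.0, 1.0), delta)
worst_case = math.log(1.0 / delta) / 50.0   # time allowed by the e^{50 t} bound

print("observed separation time:", t_split)
print("worst-case bound reaches 1 at:", worst_case)
```

The observed separation time should come out far larger than the time at which the worst-case bound $e^{50t}\delta$ reaches $1$, which illustrates how pessimistic the Lipschitz estimate is.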
Applied to the theorem, this worst-case estimate says that you need $\delta\sim e^{-50(t_1-t_0)}\varepsilon$ to guarantee that solutions starting from nearby initial conditions stay within $\varepsilon$ of each other along their trajectories. Realizing this numerically would require a multi-precision data type in which such a small difference can be represented at all, a correspondingly higher working precision, and a numerical method whose order and time step give a theoretical error below those bounds.
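To get a feeling for the sizes involved, the following sketch evaluates $\log_{10}\delta$ for $\delta=e^{-L(t_1-t_0)}\varepsilon$ with $L=50$. The function names, the tolerance $\varepsilon=10^{-3}$, and the horizons $1$ and $10$ are illustrative assumptions, not values from the discussion above. Working with logarithms avoids underflow for long horizons.

```python
import math

def log10_delta(eps, horizon, lipschitz=50.0):
    """log10 of delta = exp(-lipschitz * horizon) * eps."""
    return math.log10(eps) - lipschitz * horizon / math.log(10.0)

def decimal_digits_needed(eps, horizon, lipschitz=50.0):
    """Digits required to resolve a perturbation of size delta
    next to an initial value of size about 1."""
    return math.ceil(-log10_delta(eps, horizon, lipschitz))

eps = 1e-3
for horizon in (1.0, 10.0):
    print(horizon, log10_delta(eps, horizon), decimal_digits_needed(eps, horizon))

# In double precision a perturbation this small is lost already for horizon 1:
delta_short = 10.0 ** log10_delta(eps, 1.0)
print(1.0 + delta_short == 1.0)
```

Double precision carries roughly 16 significant decimal digits, so under these assumptions even a horizon of one time unit asks for a perturbation that cannot be distinguished from zero when added to an initial value of size $1$.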