There is a theorem in page 100 of Arnold's Mathematical Methods of Classical Mechanics, which says that:
If $\cfrac{dx}{dt} = f(x) = Ax + R_2(x)$, where $A = \cfrac{\partial f}{\partial x}|_{x = 0}$, $R_2(x) = O(x^2)$, and $\cfrac{dy}{dt} = Ay$, $y(0) = x(0)$, then for any $\vphantom{\cfrac12} T>0$ and for any $\xi > 0$ there exists $\delta > 0$ such that if $|x(0)| < \delta$, then $\vphantom{\cfrac12}|x(t) - y(t)| < \xi \delta$ for all $t$ in the interval $0 < t < T$.
How can I prove this theorem rigorously?
Cross-posted at math.se here.
The difference between $x$ and $y$ satisfies the equation $$\frac{\text d}{\text dt}(x(t)-y(t))=A(x-y)+R_2(x).$$
This can be solved, formally, by variation of parameters, to give $$x(t)-y(t)=\int_0^t e^{A(t-\tau)}R_2(x(\tau))\text d\tau.$$ Their separation is then bounded by $$\begin{align}|x(t)-y(t)| \leq\int_0^t | e^{A(t-\tau)}R_2(x(\tau))|\text d\tau \leq\int_0^T e^{|A|(t-\tau)}|R_2(x(\tau))|\text d\tau.\end{align}$$
Now, $R_2(x)=O(x^2)$, which means that there exists a constant $M$ and a distance $X$ such that $|R_2(x)|\leq Mx^2$ whenever $|x|\leq X$. To be able to use this, we need to ensure that $|x(t)|$ will be no greater than some $\delta$-controllable constant for all times $t\in[0,T]$, and our tool to achieve this is making the initial condition $x(0)=x_0$ small enough. This is essentially the statement that the solution is continuous with respect to the initial condition (i.e. around the solution $x(t)\equiv0$ for $x(0)=0$).
This continuity is a standard fact in differential equations but I'll sketch a proof here. Any solution $x(t)$ must be differentiable, hence continuous, and hence bounded. This means the condition $|R_2(x(t))|\leq M_{x_0}x(t)^2$ will hold for all $t\in[0,T]$, for some suitable constant $M_{x_0}$. The right-hand side of the differential equation for $x(t)$ is therefore Lipschitz continuous: $$|Ax(t)+R_2(x(t))|\leq L|x(t)|\tag1$$ for some $L>0$. (Alternatively, you might impose this to begin with.) Setting $u(t)=|x(t)|^2$, you have $$ \frac{\text du}{\text dt} \leq\left|\frac{\text du}{\text dt}\right| = \left|2x(t)\frac{\text dx}{\text dt}\right| \leq 2L |x(t)|^2=2L u(t). $$ By Gronwall's inequality this implies that $|u(t)|\leq|u(0)|e^{2L t}$, which means that $|x(t)|\leq |x_0| e^{LT}$ for all $t\in[0,T]$.
So, in a nutshell: if you can assume or prove that (1) holds with $L$ independent of $x_0$, then your solution $|x(t)|$ will be bounded by $|x_0| e^{LT}$ for all $t\in[0,T]$. Additionally, this means that if you choose $|x_0|<\delta\leq Xe^{-LT}$, you can bind $|x(t)|<X$ for all $t\in[0,T]$.
Let's now return to the difference $|x(t)-y(t)|$. We can now apply the smallness bound on $R_2$, which gives us $$\begin{align} |x(t)-y(t)| &\leq\int_0^t e^{|A|(t-\tau)}|R_2(x(\tau))|\text d\tau \leq\int_0^t e^{|A|(t-\tau)} M|x(\tau)|^2\text d\tau \\& \leq M\int_0^T e^{|A|T}|x_0|^2 e^{2LT}\text d\tau \leq \delta^2 MT e^{(2M+|A|)T} \end{align}$$ for all $t\in[0,T]$.
Thus, if you're given $\xi>0$ and $T<0$, then choosing $\delta=\min\{Xe^{-LT},\xi e^{-(2M+|A|)T}/MT\}$ you can ensure that $$|x(t)-y(t)|<\xi\delta$$ for all $t\in[0,T]$.
Finally, note that while this is phrased in one-dimensional language, it holds with trivial extensions (i.e. simply with appropriately defined norms) for the multidimensional case, and therefore for higher-than-first-order ODEs.