Suppose we generate three very large samples of numbers from the uniform, normal, and lognormal distributions.
I have heard (but can't find the source) that "since the aforementioned distributions are differentiable and have Lipschitz continuity (right?), we can model the CDF of the samples (with error < $\epsilon$) using only a few linear spline segments".
I do observe this experimentally, but I don't know which mathematical theorems express this fact. In the following CDFs, for example, three are synthetic (generated from differentiable distributions) and the other three come from real-world data. If I fit linear splines to these CDFs (with a small error tolerance), the real-world datasets need very many segments, while the synthetic ones (even the highly skewed ones, like the lognormal) need just a few. I think this is because statistical distributions are very linear over short ranges [= when you zoom in on the CDF, as shown in the small windows here:]
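To make the experiment concrete, here is a minimal sketch of what I do. Everything here is my own ad-hoc stand-in, not from any library: `count_segments` is a greedy fit that extends each chord until some point deviates vertically by more than $\epsilon$, and the clustered sample is a synthetic proxy for "real-world-like" data whose mass piles up on a few narrow clusters:

```python
import numpy as np

def count_segments(x, y, eps):
    """Greedy piecewise-linear cover of the points (x, y):
    extend each chord as far right as possible while every point
    under it stays within eps vertically, then start a new chord."""
    n = len(x)
    count = 1
    start, end = 0, 1
    while end < n - 1:
        xs = x[start:end + 2]
        ys = y[start:end + 2]
        span = xs[-1] - xs[0]
        if span == 0.0:          # vertical run: the chord fits exactly
            end += 1
            continue
        slope = (ys[-1] - ys[0]) / span
        dev = np.abs(ys[0] + slope * (xs - xs[0]) - ys)
        if dev.max() <= eps:
            end += 1             # chord still fits, keep extending it
        else:
            count += 1           # start a new segment at the last fit
            start, end = end, end + 1
    return count

rng = np.random.default_rng(0)
n, eps = 10_000, 0.01
y = np.arange(1, n + 1) / n      # empirical CDF heights

# smooth synthetic sample: very skewed, but a differentiable CDF
smooth = np.sort(rng.lognormal(0.0, 1.0, n))

# "real-world-like" sample: mass piled onto 40 narrow clusters, so
# the empirical CDF is a staircase with jumps of about 1/40 > 2*eps
centers = rng.lognormal(0.0, 1.0, 40)
clustered = np.sort(rng.choice(centers, n) + rng.normal(0.0, 1e-4, n))

print("smooth:   ", count_segments(smooth, y, eps))
print("clustered:", count_segments(clustered, y, eps))
```

With the same tolerance, the smooth sample consistently needs far fewer segments than the clustered one, matching what I see on real data.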
Intuitively, the synthetic CDFs are generated by differentiable functions (e.g., the normal PDF), so the CDF becomes a straight line when we "zoom in"; a real-world CDF is not like that. Which statistical measure can show/quantify this? There should be a notion of "local variance" or "local linearity" that captures how hard it is for a linear spline to fit the data. Right?
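My rough attempt to formalize the intuition (this is just the standard error bound for linear interpolation; I may be misapplying it): if $L_{[a,b]}$ denotes the chord of $F$ over $[a,b]$, and $F$ is twice differentiable, then

$$\left|F(x) - L_{[a,b]}(x)\right| \;\le\; \frac{(b-a)^2}{8}\,\max_{t\in[a,b]}|F''(t)| \;=\; \frac{(b-a)^2}{8}\,\max_{t\in[a,b]}|p'(t)|,$$

where $p = F'$ is the PDF. So if the PDF has a bounded derivative, a tolerance of $\epsilon$ allows segments of length $h \approx \sqrt{8\epsilon / \max|p'|}$, i.e. a number of segments that depends only on $\epsilon$ and $\max|p'|$, not on the sample size. An empirical CDF of clustered real-world data behaves like a function with jumps (effectively unbounded $p'$), so no such bound applies. Is $\max|p'|$ (or something like it) the "local linearity" measure I am after?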
Could you please point me to the precise statistical context (theorems or results) that proves/suggests this phenomenon?
P.S. 1: The spline model will be evaluated against the same empirical data it was fitted on, so I am not concerned with generalization.
