How sensitive is a natural cubic spline?

I am interested in using natural cubic splines to generate possible replacement values in the quality control of data. I would like to do this as close to real time as I can. That is, I would like to use only one point (today's value) to the right of the value I wish to predict (yesterday's value) and more points (the past) to the left. My question is: how far into the past should I go? I know that a natural cubic spline takes into account every data point that you feed it. I just wonder how sensitive it is to, say, 40 points versus maybe 15 or 20 when the point I am evaluating at is so far to the "right".

If anybody has knowledge of this or could at least point me to further reading, I would appreciate it. Thanks.

There are 2 best solutions below

0
On BEST ANSWER

To understand the sensitivity to far-away data points, you should look at the graphs of the cardinal basis functions for the space of natural cubic splines. See the second set of pictures in this question.

As you can see, these functions decay to almost zero quite rapidly (though they are always non-zero, except at knots). So, in your kind of application, I would say that the difference between using 15 or 20 points would be negligible. In fact, if it were me, I'd probably choose 6 to 8 points.
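
To make the decay concrete, here is a minimal sketch in plain Python (not part of the original answer). It assumes daily data at integer times, with yesterday's value left out and today's value as the only knot to its right. Because the spline is linear in the data, the estimate for yesterday is a weighted sum of the inputs, and each weight is the cardinal function of one knot evaluated at yesterday. The helper names `natural_second_derivatives`, `spline_value` and `estimate_weights` are my own, chosen for illustration.

```python
from bisect import bisect_right


def natural_second_derivatives(xs, ys):
    """Second derivatives M at the knots of the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1  # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]
    M = [0.0] * (n + 1)  # natural end conditions: M[0] = M[n] = 0
    if n >= 2:
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n)]
        # Thomas algorithm; row k couples M[k], M[k+1], M[k+2] with off-diagonals h[k], h[k+1].
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        M[n - 1] = rhs[n - 2] / diag[n - 2]
        for k in range(n - 3, -1, -1):
            M[k + 1] = (rhs[k] - h[k + 1] * M[k + 2]) / diag[k]
    return M


def spline_value(xs, ys, M, t):
    """Evaluate the spline with knot second derivatives M at t (end pieces are extended)."""
    n = len(xs) - 1
    i = min(max(bisect_right(xs, t) - 1, 0), n - 1)
    hi = xs[i + 1] - xs[i]
    A, B = xs[i + 1] - t, t - xs[i]
    return ((M[i] * A**3 + M[i + 1] * B**3) / (6.0 * hi)
            + (ys[i] / hi - M[i] * hi / 6.0) * A
            + (ys[i + 1] / hi - M[i + 1] * hi / 6.0) * B)


def estimate_weights(n_past):
    """Weight of each data point in the estimate of yesterday's (missing) value.

    Assumed layout: past days 0 .. n_past-1, yesterday at n_past (excluded),
    today at n_past+1.
    """
    knots = [float(d) for d in range(n_past)] + [float(n_past + 1)]
    target = float(n_past)
    weights = []
    for j in range(len(knots)):
        unit = [0.0] * len(knots)
        unit[j] = 1.0  # data of the j-th cardinal function
        M = natural_second_derivatives(knots, unit)
        weights.append(spline_value(knots, unit, M, target))
    return weights


if __name__ == "__main__":
    for n_past in (6, 15, 40):
        w = estimate_weights(n_past)
        # Show the weights of the six points closest to yesterday.
        print(n_past, [round(v, 6) for v in w[-6:]])
```

Comparing the printed weights for 15 and 40 past points is a quick way to check, on your own grid, how little the far-away points contribute.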

Also, you might consider scrapping the spline idea altogether. I'd suggest just interpolating a few nearby points using a low-degree polynomial. You can write down a closed-form formula for the interpolant, and this will make it easy to do the computations in real time.
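
As an illustration of the polynomial alternative, here is a small sketch in plain Python using the Lagrange form of the interpolant. The node layout (three past days plus today, with yesterday at time 0) and the function name `lagrange_weights` are assumptions made for the example, not something prescribed above.

```python
def lagrange_weights(xs, t):
    """Weights w_j such that the interpolating polynomial at t equals sum(w_j * y_j)."""
    weights = []
    for j, xj in enumerate(xs):
        w = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                w *= (t - xk) / (xj - xk)
        weights.append(w)
    return weights


# Assumed layout: yesterday is time 0, the three days before it are -3, -2, -1,
# and today is +1. The weights depend only on this layout, not on the data.
days = [-3.0, -2.0, -1.0, 1.0]
weights = lagrange_weights(days, 0.0)

# Made-up readings from a cubic, which a four-point interpolant reproduces exactly.
readings = [d**3 - 2.0 * d + 5.0 for d in days]
estimate = sum(w * y for w, y in zip(weights, readings))

if __name__ == "__main__":
    print(weights, estimate)
```

For equally spaced daily data the weights can be computed once and reused, so each new estimate is just a short dot product. For the layout assumed here they work out to 1/4, -1, 3/2 and 1/4.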

0
On

In general, the error of the natural spline comes from the end points and is bounded by $O(h^2)$, where $h$ is the grid step. The true value of the second derivative of the function also matters, i.e. how far it is from zero at the end points, since the natural spline uses $S_0''(x_0)=0=S_n''(x_n)$ as its boundary condition. Of course, the closer you are to the known data, the better the extrapolation.
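
A rough way to see the $O(h^2)$ behaviour is to interpolate a function whose second derivative is non-zero at the end points and halve the grid step. The sketch below, in plain Python, does this for $e^x$ on $[0,1]$. The test function, the interval and the helper names are my own choices for illustration.

```python
import math
from bisect import bisect_right


def natural_second_derivatives(xs, ys):
    """Second derivatives M at the knots of the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    M = [0.0] * (n + 1)  # natural end conditions: M[0] = M[n] = 0
    if n >= 2:
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n)]
        # Thomas algorithm for the tridiagonal system in the interior unknowns.
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        M[n - 1] = rhs[n - 2] / diag[n - 2]
        for k in range(n - 3, -1, -1):
            M[k + 1] = (rhs[k] - h[k + 1] * M[k + 2]) / diag[k]
    return M


def spline_value(xs, ys, M, t):
    """Evaluate the spline with knot second derivatives M at t."""
    n = len(xs) - 1
    i = min(max(bisect_right(xs, t) - 1, 0), n - 1)
    hi = xs[i + 1] - xs[i]
    A, B = xs[i + 1] - t, t - xs[i]
    return ((M[i] * A**3 + M[i + 1] * B**3) / (6.0 * hi)
            + (ys[i] / hi - M[i] * hi / 6.0) * A
            + (ys[i + 1] / hi - M[i + 1] * hi / 6.0) * B)


def max_error(n_intervals, samples=2001):
    """Maximum error on [0, 1] of the natural spline interpolating exp on a uniform grid."""
    xs = [i / n_intervals for i in range(n_intervals + 1)]
    ys = [math.exp(x) for x in xs]
    M = natural_second_derivatives(xs, ys)
    grid = [k / (samples - 1) for k in range(samples)]
    return max(abs(spline_value(xs, ys, M, t) - math.exp(t)) for t in grid)


if __name__ == "__main__":
    e16, e32 = max_error(16), max_error(32)
    # Halving h should cut the error by roughly a factor of 4 if it behaves like h^2.
    print(e16, e32, e16 / e32)
```

If the second derivative happened to vanish at both ends, the natural end condition would match the function there and the boundary would no longer be the dominant source of error.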

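The following is just an example, a piece of MATLAB code to play with (`csape` requires the Curve Fitting Toolbox).

    M=zeros(1,7);
    err=zeros(1,7);
    for i=1:7
        M(i)=2^(i+3);                    % number of nodes: 16, 32, ..., 1024
        x=linspace(a,b,M(i));            % interpolation nodes on [a,b]
        y=linspace(a,c,2000);            % evaluation points on [a,c], which extends past b
        %u=spline(x,f(x),y);             % not-a-knot alternative
        pp=csape(x,f(x),'variational');  % natural cubic spline
        u=ppval(pp,y);                   % ppval extends the end pieces outside [a,b]
        err(i)=norm(u-f(y),inf);         % maximum absolute error on [a,c]
    end

    %plot(y,f(y),'b',y,u)
    M
    err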

I ran the above code with the following settings (define them before running the loop):

    a=-pi;
    b=0;
    c=pi;
    m=2;               % set m=10 for the second run
    f=@(x) sin(m*x);

For $m=2$:

    M   = 16        32        64        128       256       512       1024
    err = 34.4515   34.9169   35.0242   35.0501   35.0564   35.0580   35.0584

and for $m=10$:

    M   = 16        32        64        128       256       512       1024
    err = 2896.9    4677.7    5028.3    5109.9    5129.8    5134.7    5135.9
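
The errors level off instead of shrinking because the evaluation points run over $[a,c]=[-\pi,\pi]$ while the data only cover $[a,b]=[-\pi,0]$, so the maximum is dominated by extrapolation over $(0,\pi]$. One way to sanity-check the limiting values is sketched below in plain Python. It rests on my assumption that, as the grid is refined, the last spline piece approaches the cubic Taylor polynomial of $\sin(mx)$ at $b=0$. That is plausible here because $f''(0)=0$ agrees with the natural end condition. The function name `taylor_extrapolation_error` is made up for the example.

```python
import math


def taylor_extrapolation_error(m, b=0.0, c=math.pi, samples=2000):
    """Max |p(y) - sin(m*y)| on [b, c], with p the cubic Taylor polynomial of sin(m*y) at 0.

    Assumption: p stands in for the extended last piece of the natural spline
    in the limit of a very fine grid on [a, b] with b = 0.
    """
    worst = 0.0
    for k in range(samples):
        y = b + (c - b) * k / (samples - 1)
        p = m * y - (m * y) ** 3 / 6.0  # sin(u) ~ u - u^3/6 with u = m*y
        worst = max(worst, abs(p - math.sin(m * y)))
    return worst


if __name__ == "__main__":
    for m in (2, 10):
        print(m, taylor_extrapolation_error(m))
```

The two values this produces can be compared with the last columns of the tables above. The error also grows roughly like $m^3$, which is consistent with the jump between $m=2$ and $m=10$.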