Determining the value of ECDF at a point using Matlab

I have data $X=[x_1,\dots,x_n]$.

In Matlab, I know by using

[f,x]=ecdf(X)
plot(x,f)

we will have the empirical distribution function based on $X$.

Now, given a point $x$, how can I find the value of the ECDF at that point?

1
On BEST ANSWER

You can use interpolation for this. In Matlab, interp1 performs a variety of interpolation methods on 1-D data. In your case, you might try nearest neighbor or possibly linear interpolation, though you could attempt higher-order schemes depending on your data. Nearest neighbor interpolation returns the ECDF value at the sample point in your data $X$ that is closest to a supplied query point $x$. Here's an example:

rng(1);           % Set random seed to make results repeatable
Y = randn(1,100); % Normally distributed random data
[F,X] = ecdf(Y);  % Empirical CDF
stairs(X,F);      % Use stairstep plot to see actual shape
hold on;
X = X(2:end);     % Sample points, ECDF duplicates initial point, delete it
F = F(2:end);     % Sample values, ECDF duplicates initial point, delete it
x = [-1 0 1.5];   % Query points
y = interp1(X,F,x,'nearest'); % Nearest neighbor interpolation
plot(x,y,'ko');   % Plot interpolated points on ECDF

This produces a stairstep plot of the ECDF with the three interpolated query points marked as black circles.

Note that in the code above I had to remove the first point from the values returned by ecdf. This is because interp1 requires that the sample points (here X) be strictly monotonically increasing or decreasing.
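The de-duplication step can also be written with unique, which avoids assuming that the repeated value is always the first entry. This is only a minimal sketch: the vectors below are hard-coded, hypothetical values shaped like ecdf output, so that the block runs without the Statistics Toolbox.

```matlab
% Hypothetical vectors shaped like the output of [F,X] = ecdf([1 2 3]):
% the minimum appears twice, first with F = 0 and then with its actual value.
X = [1; 1; 2; 3];
F = [0; 1/3; 2/3; 1];

% Keep the last occurrence of each sample point so interp1 sees unique X
[Xu, iu] = unique(X, 'last');
Fu = F(iu);

x = [1.2 2.4 3];                    % Query points inside the data range
y = interp1(Xu, Fu, x, 'nearest');  % Nearest neighbor lookup, as in the code above
disp([x; y]);
```

Keeping the last occurrence discards the F = 0 entry at the minimum, which is the same point the code above removes by indexing with 2:end.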

0
On

Using interp1 is a nice idea, but we should not use the 'nearest' option. To get the right result we must use the 'previous' option, because an ECDF is flat everywhere except at its jump points. Also, don't forget to handle query points that fall outside the range of the data. If [f, x] is returned by the ecdf command, I recommend using

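x = x(2:end); f = f(2:end);               % ecdf repeats the minimum; drop the duplicate so interp1 gets unique points
y = interp1(x, f, vec_eval, 'previous');  % Value at the last jump at or before each query point
y(vec_eval < x(1)) = 0;                   % Below the minimum of the original data
y(vec_eval >= x(end)) = 1;                % At or above the maximum of the original data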

where vec_eval is the vector of points at which you want to evaluate the ECDF.
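One way to check the 'previous' approach is to compare it with the definition of the ECDF, $\hat F_n(t) = \frac{1}{n}\,\#\{i : x_i \le t\}$. The sketch below is illustrative only. The sample data and query points are made up, and the step values are built with unique, accumarray and cumsum instead of ecdf, so the block runs without the Statistics Toolbox. It assumes a release with implicit expansion (R2016b or later) and the 'previous' method of interp1.

```matlab
% Hypothetical sample and query points, chosen only for illustration
data = [3 1 4 1 5 9 2 6];
vec_eval = [0 1 2.5 4 9 10];

% Direct evaluation from the definition: fraction of data <= each query point
y_direct = mean(data(:) <= vec_eval(:).', 1);

% Step-function lookup, using base functions in place of ecdf
[xs, ~, idx] = unique(data(:));      % Sorted distinct values
counts = accumarray(idx, 1);         % Multiplicity of each distinct value
fs = cumsum(counts) / numel(data);   % ECDF value at each distinct value
y_prev = interp1(xs, fs, vec_eval, 'previous');
y_prev(vec_eval < xs(1)) = 0;        % Left of the smallest data point
y_prev(vec_eval >= xs(end)) = 1;     % At or right of the largest data point

disp([vec_eval; y_direct; y_prev]);  % The last two rows should agree
```

For a handful of query points the direct formula may be the simpler choice, since it needs neither ecdf nor interp1. The lookup avoids building an n-by-m comparison matrix when both the data and the query vector are large.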