How to value smoothness of a dataset?


I have a dataset that shows how often each wind angle occurs in each year, like this:

enter image description here

But the dataset is not very clean; sometimes the distribution looks like this:

enter image description here

Notice the angles (x-axis) from 260 to 320: that range is not as smooth as in the plot above.

I'm wondering if there is an indicator that I can use to measure the overall smoothness of such a distribution, so I can tell when the data is corrupted and needs correction.

What I can think of is to calculate the differences between adjacent bins.

BTW: What tag should this question belong to?


There are 1 best solutions below


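A first approximation would be to calculate the variance of the differences between adjacent samples. Assuming your values on the y-axis are $y_0, y_1, \dots, y_{36}$, there are $n = 36$ differences $y_i - y_{i+1}$, and their sample variance is $$\overline{Var}(\Delta Y) = \frac{1}{n-1} \sum_{i=0}^{35}\left((y_i - y_{i+1})-\overline{E}[\Delta Y]\right)^2,$$ where $\overline{E}[\Delta Y] = \frac{1}{n}\sum_{i=0}^{35}(y_i - y_{i+1})$ is the mean difference. A larger value indicates a less smooth distribution.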
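As a rough illustration of the variance-of-differences idea, here is a minimal sketch using only the Python standard library. The function name `diff_variance` and the two toy series are hypothetical, made up for this example; they are not taken from the wind dataset.

```python
from statistics import variance


def diff_variance(y):
    """Sample variance of the successive differences y[i] - y[i+1]."""
    diffs = [a - b for a, b in zip(y, y[1:])]
    # statistics.variance divides by (number of differences - 1)
    return variance(diffs)


# Toy data (invented for illustration): a steady ramp and a jagged series.
smooth = [10, 12, 14, 16, 18, 20]
jagged = [10, 18, 12, 20, 14, 16]

print("smooth:", diff_variance(smooth))
print("jagged:", diff_variance(jagged))
```

The jagged series should give the larger value. Note that this score only compares neighbouring bins, so any threshold for "corrupted" would have to be chosen from data you already trust.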

A second, and probably much more accurate, method is to fit a model such as a $\sin()$ function to the data (either for each year or for the whole population) and then calculate the difference between that model and each $y_i$.
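The model-fitting idea could be sketched as follows, assuming a single-harmonic model $y \approx a + b\sin\theta + c\cos\theta$ fitted by ordinary least squares, with the root-mean-square residual used as the roughness score. The model form, the helper names `solve3` and `sinusoid_rms_residual`, and the synthetic data are all assumptions for illustration; a real wind distribution may need a different model.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def sinusoid_rms_residual(angles_deg, y):
    """Fit y ~ a + b*sin(t) + c*cos(t); return ([a, b, c], RMS residual)."""
    basis = [[1.0, math.sin(math.radians(t)), math.cos(math.radians(t))]
             for t in angles_deg]
    # Normal equations (A^T A) coef = A^T y for the least-squares fit.
    AtA = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    Aty = [sum(r[i] * v for r, v in zip(basis, y)) for i in range(3)]
    coef = solve3(AtA, Aty)
    residuals = [v - sum(c * f for c, f in zip(coef, r))
                 for r, v in zip(basis, y)]
    rms = math.sqrt(sum(e * e for e in residuals) / len(residuals))
    return coef, rms


# Synthetic data (invented): a clean sinusoid, and a copy with a zigzag
# disturbance added to the bins between 260 and 320 degrees.
angles = list(range(0, 360, 10))
clean = [100 + 40 * math.sin(math.radians(t - 30)) for t in angles]
noisy = [v + (15 if (t // 10) % 2 == 0 else -15) if 260 <= t <= 320 else v
         for t, v in zip(angles, clean)]

_, rms_clean = sinusoid_rms_residual(angles, clean)
_, rms_noisy = sinusoid_rms_residual(angles, noisy)
print("clean RMS residual:", rms_clean)
print("noisy RMS residual:", rms_noisy)
```

Looking at the individual residuals rather than only their RMS would also show which angle bins deviate most from the fitted curve.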