Standard deviation of binned sample


I have to calculate the rms size of a sample (say 1D for this case). When I only have limited resolution, the rms value gets bigger. What is the name of this effect again?

Is there a way to correct for it? Could I reduce the error by using this formula?

$$\sigma = \frac{\sigma_\mathrm{measured}}{1/N_\mathrm{bins} + 1},$$

where $N_\mathrm{bins}$ is the number of bins used by the picture.

The problem is that the formula $\sigma_\mathrm{measured}^2 = \sigma^2 + \sigma_\mathrm{resolution}^2$ doesn't really help me on its own...

I did some tinkering with this Matlab code, which gives a rather nice result, but of course this is by no means a proof:

function test
n = [1 2 3 4 5 6]*1e2;   % numbers of bins to try

x = rand(200000,1);       % uniform sample on [0,1]
sx = std(x);              % reference: sample standard deviation

dx = zeros(size(n));      % relative error of the binned estimate
dc = zeros(size(n));      % relative error after the proposed correction
for i = 1:length(n)
    s = stdHist(linspace(0,1,n(i)), hist(x,n(i)));
    dx(i) = (s - sx)/sx;
    dc(i) = (s/(1/n(i) + 1) - sx)/sx;
end

figure, hold on
plot(n,dx,'r')            % uncorrected
plot(n,dc,'g')            % corrected
end

function sigma = stdHist(scale,counts)
% Weighted standard deviation of histogram counts placed at positions `scale`
    n = sum(counts);
    m = sum(counts.*scale)/n;                        % weighted mean
    sigma = sqrt(sum(counts.*(scale - m).^2) / n);
end
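For comparison, here is a rough Python/NumPy port of the same experiment (my addition, not from the original post). It reproduces the Matlab code's choice of evaluating the histogram on `linspace(0, 1, n)`:

```python
import numpy as np

def std_hist(scale, counts):
    """Weighted standard deviation of histogram counts placed at positions `scale`."""
    total = counts.sum()
    mean = (counts * scale).sum() / total
    return np.sqrt((counts * (scale - mean) ** 2).sum() / total)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200_000)
sx = x.std()                               # reference: sample standard deviation

ns = np.array([1, 2, 3, 4, 5, 6]) * 100
dx = np.empty(len(ns))                     # relative error, uncorrected
dc = np.empty(len(ns))                     # relative error, with the proposed correction
for i, n in enumerate(ns):
    counts, _ = np.histogram(x, bins=n, range=(0.0, 1.0))
    s = std_hist(np.linspace(0.0, 1.0, n), counts)  # same scale as the Matlab code
    dx[i] = (s - sx) / sx
    dc[i] = (s / (1.0 / n + 1.0) - sx) / sx

print(dx)   # positive, shrinking roughly like 1/n
print(dc)   # much smaller in magnitude
```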

(Plot: relative error of the uncorrected (red) and corrected (green) estimates versus the number of bins.)

Any help would be greatly appreciated!

1 Answer

You might want to Google the term "Sheppard's correction". The topic seems to have received little attention in recent decades, possibly unjustly so, and I'm not prepared to go into its details without reviewing it closely first.

If a random variable $X$ is normally distributed, then when $X$ gets rounded to the nearest integer, the rounding error is approximately uniform and nearly uncorrelated with $X$. Its variance therefore adds to that of $X$, so estimates of variance based on rounded data over-estimate. Sheppard's correction, ultimately derived from Euler--Maclaurin sums, subtracts $h^2/12$ (for rounding width $h$) to compensate.
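As a numerical illustration (my addition, not part of the original answer): rounding a normal sample to a grid of width $h$ inflates the variance by roughly $h^2/12$, which Sheppard's correction then subtracts.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_true = 1.0
x = rng.normal(0.0, sigma_true, 1_000_000)

h = 0.5                                  # rounding width ("bin" size)
x_rounded = np.round(x / h) * h          # round each value to the nearest grid point

var_grouped = x_rounded.var()            # inflated by roughly h**2 / 12
var_sheppard = var_grouped - h**2 / 12   # Sheppard's correction

print(var_grouped, var_sheppard)         # ~1.021 vs ~1.000
```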

If $X$ is uniformly distributed (with the rounding grid aligned to its support), then the rounding error is negatively correlated with $X$, and rounded data under-estimate the variance: the negative covariance more than cancels the added variance of the rounding error.
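Again as a sketch (my addition): for a uniform sample on $[0,1]$ with an aligned grid of width $h$, snapping each value to its bin centre shrinks the variance from $1/12$ to exactly $(1-h^2)/12$.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.uniform(0.0, 1.0, 1_000_000)     # true variance 1/12

h = 0.1                                  # bin width, grid aligned with [0, 1]
u_binned = (np.floor(u / h) + 0.5) * h   # snap each value to its bin centre

var_binned = u_binned.var()              # (1 - h**2) / 12, slightly below 1/12
print(var_binned, 1 / 12)
```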