Unbiased estimator for variance when each $x_i$ has probability $p_i$


Suppose we have a sample of 3 "observations" $x_0, x_1, x_2$ from a PMF of unknown mean, where we know that each observation $x_i$ has probability $p_i$.

Then a biased estimator of the mean is $\bar\mu := \sum_{i=0}^2 p_i \, x_i$.

Also:

  0. An estimator of the variance is $\sum_{i=0}^2 p_i \, (x_i - \bar{\mu})^2$, where $\bar{\mu}$ is as above.

  1. Another estimator of the variance is $\sum_{i=0}^2 {1 \over N} (x_i - \bar{\mu})^2$, where $\bar{\mu}$ is as above and $N = 3$ is the number of observations.

  2. Another estimator of the variance is $\sum_{i=0}^2 {1 \over N - 1} (x_i - \bar{\mu})^2$, where $\bar{\mu}$ is as above.
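For concreteness, here is a small numerical sketch (the support values and probabilities are made up for illustration) that computes $\bar\mu$ and the three variance estimators above: the probability-weighted one, the $1/N$ one, and the $1/(N-1)$ one.

```python
# Toy PMF: values x_i with known probabilities p_i (illustrative choices only).
x = [1.0, 2.0, 5.0]
p = [0.5, 0.3, 0.2]
N = len(x)  # N = 3 observations

# Probability-weighted mean (the \bar\mu above).
mu_bar = sum(pi * xi for pi, xi in zip(p, x))

# Probability-weighted squared deviations.
var_weighted = sum(pi * (xi - mu_bar) ** 2 for pi, xi in zip(p, x))

# Equal weights 1/N.
var_over_N = sum((xi - mu_bar) ** 2 for xi in x) / N

# Bessel-style equal weights 1/(N - 1).
var_bessel = sum((xi - mu_bar) ** 2 for xi in x) / (N - 1)

print(mu_bar, var_weighted, var_over_N, var_bessel)
```

The three estimators generally disagree whenever the $p_i$ differ, which is exactly the situation the question is about.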

Estimator 2) is what is usually used as an unbiased estimator (i.e. with Bessel's correction), but I think that should only make sense when each $x_i$ is equally likely (i.e. $p_i = p_j$ for all $i$ and $j$), which is not true in general, so I don't think Estimators 1) and 2) are correct for this situation.
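As a sanity check on the equal-probability case, here is a hedged Monte Carlo sketch (toy values; standard i.i.d. sampling, and using the sample mean of each draw rather than the $\bar\mu$ above, since that is the setting in which Bessel's correction is derived). On average the $1/(N-1)$ version matches the population variance, while the $1/N$ version undershoots by the factor $(N-1)/N$:

```python
import random

random.seed(0)

# True distribution: uniform over three values (the equal-p_i case).
values = [1.0, 2.0, 5.0]
true_mean = sum(values) / 3
true_var = sum((v - true_mean) ** 2 for v in values) / 3  # population variance

N = 3            # sample size per draw
trials = 200_000
sum_over_N = 0.0
sum_bessel = 0.0
for _ in range(trials):
    sample = [random.choice(values) for _ in range(N)]
    m = sum(sample) / N                       # sample mean of this draw
    ss = sum((s - m) ** 2 for s in sample)    # sum of squared deviations
    sum_over_N += ss / N
    sum_bessel += ss / (N - 1)

# ss/(N-1) averages to the population variance; ss/N is low by (N-1)/N.
print(true_var, sum_over_N / trials, sum_bessel / trials)
```

With unequal $p_i$ this derivation no longer goes through as-is, which motivates the questions below.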

  • What's an unbiased estimator for the variance in this case?

  • What are some (other) nice estimators for the variance in this case?

  • How can one add a Bessel-type correction to Estimator 0)?

  • Does any of this change if, instead of knowing the probability $p_i$ of each $x_i$, we instead estimate the probabilities $p_i$ from the observations themselves?