Best piecewise constant estimator in density estimation


Assume that $X_1,\dots,X_n$ are i.i.d. random variables taking values in $[0;1)$, whose common distribution has a density $f_0$ that we would like to recover.

One well-known estimator is the histogram estimator, for instance on a regular grid of step $p^{-1}$:

$$\hat{f}(x)=\frac{p}{n}\sum_{i=0}^{p-1}1_{x\in[ip^{-1};(i+1)p^{-1})}\ \#\Big\{1\leq j\leq n \,\Big|\,X_j\in[ip^{-1};(i+1)p^{-1})\Big\}$$

It is then a piecewise constant estimator whose values are determined by the empirical frequency in each bin.
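
To make the definition concrete, here is a minimal Python sketch of the histogram estimator on the regular grid of step $p^{-1}$. The function name `histogram_estimator` and the use of NumPy are illustrative choices, not part of the question; the sketch assumes all samples lie in $[0;1)$.

```python
import numpy as np

def histogram_estimator(samples, p):
    """Histogram density estimate on [0, 1) with p equal-width bins.

    Hypothetical helper for illustration; assumes every sample is in [0, 1).
    Returns the p bin heights and a function evaluating the estimate at x.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    # Bin index i such that the sample lies in [i/p, (i+1)/p).
    idx = np.minimum(np.floor(samples * p).astype(int), p - 1)
    counts = np.bincount(idx, minlength=p)
    # Height on bin i is (p / n) * (number of samples in bin i).
    heights = p * counts / n

    def f_hat(x):
        x = np.asarray(x, dtype=float)
        inside = (x >= 0) & (x < 1)
        i = np.clip(np.floor(x * p).astype(int), 0, p - 1)
        # The estimate is zero outside [0, 1).
        return np.where(inside, heights[i], 0.0)

    return heights, f_hat

# Small usage example with p = 4 bins and n = 4 observations.
heights, f_hat = histogram_estimator([0.05, 0.15, 0.17, 0.95], 4)
# Three points fall in [0, 0.25) and one in [0.75, 1),
# so heights equals [3, 0, 0, 1] and the estimate integrates to 1.
```

The heights sum to $p$, so the estimate always integrates to one over $[0;1)$.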

Now, does there exist another piecewise constant estimator on the same partition that is better in some sense (for instance, with a smaller MISE or MSE), particularly in the asymptotic setting?

If the partition introduced above is $\left(I_1,\dots,I_p\right)$, the estimator $\hat{f}$ is equivalent to an estimator of the vector of probabilities $\left(P(I_1),\dots,P(I_p)\right)$. If we want to estimate $P(I)$, for $I$ a measurable subset of $[0;1)$, from the observations $1_{X_i\in I}$, we can show that the empirical frequency attains the Cramér-Rao bound (it is like estimating the parameter of a Bernoulli distribution). So, if we are interested in the MISE of our estimator, it makes sense to use $\hat{f}$. However, in the original problem, we also know exactly where the $X_i$ lie in $[0;1)$. So, we could think of using this knowledge to build a different piecewise constant estimator.
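
For reference, the Bernoulli computation mentioned above can be written out as follows. The symbols $\theta=P(I)$, $\hat\theta_n=\frac{1}{n}\sum_{i=1}^n 1_{X_i\in I}$ and $\mathcal{I}_n$ (the Fisher information of the $n$ indicators) are notation introduced here for illustration, and the computation assumes $0<\theta<1$:

$$\mathcal{I}_n(\theta)=\frac{n}{\theta(1-\theta)},\qquad \operatorname{Var}\big(\hat\theta_n\big)=\frac{\theta(1-\theta)}{n}=\frac{1}{\mathcal{I}_n(\theta)}.$$

Since $\hat\theta_n$ is unbiased, it attains the bound, but only within the reduced experiment in which the indicators $1_{X_i\in I}$ are all that is observed.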

Why might this not be sensible, and why does $\hat{f}$ seem to be the only piecewise constant estimator that is studied?