Point estimate of a parameter outside the interval of allowable parameter values


I have a question about point estimators. Suppose I have an estimator for a parameter, say $p$ with $p \in (0, 1)$, and from the observed data the point estimate comes out to $-0.01$. What should the conclusion be? That the estimator cannot be used, that we should collect more data and recalculate, or that we should report the estimate as $0$? But $p \in (0,1)$, and $0$ is not in the interval. This came from a real situation: the point estimators work perfectly with larger samples, but sometimes with smaller samples they just don't give a result in the interval $(0,1)$.

A similar problem arises when I want to estimate the parameter $p$ of a Bernoulli distribution. A reasonable point estimator is $\frac{1}{n}\sum_{i=1}^n x_i$, where the $x_i$ are the observed data. If my observed data are $(0, 0, 0, 0, 0, 0)$, then the estimator gives $0$, but $0$ is not in $(0,1)$. So what is the conclusion? Thanks for any ideas!
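To make the boundary behavior concrete, here is a minimal sketch of the sample-mean estimator described above (the data values are the hypothetical all-zeros sample from the question):

```python
def bernoulli_mle(data):
    """Maximum-likelihood estimator of p for Bernoulli data: the sample mean."""
    return sum(data) / len(data)

# All-zero data: the estimate lands on the boundary of (0, 1).
print(bernoulli_mle([0, 0, 0, 0, 0, 0]))  # 0.0

# A mixed sample behaves as expected.
print(bernoulli_mle([1, 0, 1, 0]))  # 0.5
```

Nothing in the formula prevents the estimate from hitting $0$ or $1$ exactly; the boundary value is a legitimate output of the estimator even though it lies outside the open interval.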


There are various situations in frequentist statistics that may lead to an estimate of a parameter $\theta$ outside the range in which $\theta$ is defined. Sometimes a better estimator can be found that does not present this difficulty.

I wish you had given an example in which $\hat \theta = -0.01$ and by definition $0 < \theta < 1.$ Then I might have been able to give an explanation for this behavior, and perhaps suggest an interpretation. (An embarrassingly common interpretation is something like, "Oh well, $\theta > 0$ must be really small.")

Examples, without giving all the details:

(a) One such instance arises when the prevalence of a disease is estimated from the fraction of subjects with a positive result on a medical screening test. Screening tests are not perfect. Even in a population with no diseased subjects, one expects a certain proportion of (false) positive results. If the observed proportion of positive tests is lower than this expected proportion, then the estimate of disease prevalence may be negative. See Example 5.2 in Suess (2010): Intro. to Probability simulation... .
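A sketch of how this can happen, using the standard correction that backs out prevalence from the raw positive rate given the test's sensitivity and specificity (the numeric values below are hypothetical, chosen only to trigger a negative estimate):

```python
def prevalence_estimate(pos_rate, sensitivity, specificity):
    """Corrected prevalence estimate from an imperfect screening test:
    (observed positive rate + specificity - 1) / (sensitivity + specificity - 1).
    """
    return (pos_rate + specificity - 1) / (sensitivity + specificity - 1)

# Hypothetical test: 95% sensitivity, 98% specificity (so 2% false positives).
# If the observed positive rate (1%) is below the expected false-positive
# rate, the corrected prevalence estimate comes out negative.
print(prevalence_estimate(0.01, 0.95, 0.98))  # negative
print(prevalence_estimate(0.05, 0.95, 0.98))  # positive, as expected
```

The negative value arises exactly as described: the observed proportion of positives fell below the proportion that false positives alone would produce in a disease-free population.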

(b) Consider a one-way random effects ANOVA with model $Y_{ij} = \mu + A_i + e_{ij},$ where $A_i \stackrel{iid}{\sim} \mathsf{Norm}(0, \sigma_A)$ and $e_{ij} \stackrel{iid}{\sim} \mathsf{Norm}(0, \sigma_e).$ One can sometimes estimate $\sigma_A^2$ by a formula that involves the subtraction of two (positive) sums of squares. But if $\sigma_A$ is small compared with $\sigma_e,$ the difference, and thus the estimate of $\sigma_A^2,$ may be negative -- a nonsensical result.
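A sketch of the method-of-moments estimate for a balanced design, assuming $a$ groups of $n$ observations each, where $\hat\sigma_A^2 = (\mathrm{MSA} - \mathrm{MSE})/n$; the toy data are chosen so the within-group spread swamps the between-group spread:

```python
def var_component_estimate(groups):
    """Method-of-moments estimate of sigma_A^2 in a balanced
    one-way random-effects ANOVA: (MSA - MSE) / n."""
    a = len(groups)        # number of groups
    n = len(groups[0])     # observations per group (balanced design)
    grand = sum(sum(g) for g in groups) / (a * n)
    means = [sum(g) / n for g in groups]
    msa = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    mse = sum((x - m) ** 2
              for g, m in zip(groups, means) for x in g) / (a * (n - 1))
    return (msa - mse) / n  # negative when MSE exceeds MSA

# Nearly identical group means but large within-group variation:
# MSA is small, MSE is large, and the variance estimate is negative.
groups = [[1.0, 9.0], [2.0, 8.0], [3.0, 7.0]]
print(var_component_estimate(groups))  # negative
```

Here the subtraction of mean squares produces a negative estimate of a variance, which is exactly the nonsensical outcome described above; common fixes include truncating at $0$ or using REML estimation, which respects the parameter space.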

By contrast, in Bayesian statistics, one would begin with a prior distribution with strictly positive support -- perhaps a Beta or gamma distribution. Then the posterior distribution, used for estimation, must have strictly positive support, and such 'illegal' estimates cannot occur. [This is hardly the main reason for using a Bayesian framework for inference, but it is a nice side-effect.]
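To illustrate with the Bernoulli example from the question: under a $\mathsf{Beta}(a, b)$ prior the posterior is $\mathsf{Beta}(a + s,\, b + n - s)$ by conjugacy, and its mean is strictly inside $(0, 1)$ no matter what the data are. A minimal sketch (the flat $\mathsf{Beta}(1,1)$ prior is just one possible choice):

```python
def beta_posterior_mean(data, a=1.0, b=1.0):
    """Posterior mean of p under a Beta(a, b) prior with Bernoulli data.
    Conjugacy: posterior is Beta(a + successes, b + failures)."""
    s, n = sum(data), len(data)
    return (a + s) / (a + b + n)

# Even with the all-zero sample, the estimate stays strictly inside (0, 1).
print(beta_posterior_mean([0, 0, 0, 0, 0, 0]))  # 1/8 = 0.125
```

Because $a > 0$ and $b > 0$, the posterior mean can approach but never reach $0$ or $1$, so the 'illegal' boundary estimates cannot occur.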

The example in your second paragraph is more a philosophical issue than a probabilistic one. If samples consisting entirely of $0$s are possible, then it seems one might start with $0 \le \theta \le 1$ (and regard $0^0 \approx \epsilon^0$ as a limit with value $1$).