What is an intuitive application of estimators?


So we're currently studying estimators. We just proved the Cramér–Rao inequality, and that when it holds with equality, the estimator attaining it is the unique MVUE.

All of this to me just sounds like fancy jargon. I don't understand:

  • the point of an estimator

  • why, given an estimator $\hat{\tau}$ for $\tau$, to prove it is unbiased we need to show that $E[\hat{\tau}] = \tau$

  • what the practical applications of an estimator are, and why knowing that it's biased or unbiased or minimum helps us

My professors basically just try to delve into the theory with no explanation of why all of this is important on a larger scale. Any help would be appreciated!

There are 2 answers below.

BEST ANSWER

1) An estimator is a rule for calculating the value of a parameter from the data. For example, an estimator of the mean of a random variable is the sample mean, given by $$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$ where the $x_i$ are the observations and $n$ is the number of them. Estimators are usually marked with a little hat over the parameter of interest, such as $\hat{\theta}$.
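As a minimal sketch of this, here is the sample mean computed from simulated data (the population mean of 5.0 and standard deviation of 2.0 are assumed values chosen purely for illustration):

```python
import random

# Simulate data from a population whose true mean is 5.0 (an assumed value);
# the sample mean is an estimator of that unknown parameter.
random.seed(0)
true_mean = 5.0
sample = [random.gauss(true_mean, 2.0) for _ in range(1000)]

# The estimator: the average of the observations
x_hat = sum(sample) / len(sample)
print(x_hat)  # close to 5.0, but not exactly 5.0
```

The estimate will differ from run to run (with a different seed), which is exactly the point: an estimator is a random variable, and the data only let us approximate the parameter.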

2) That is just the definition of unbiasedness. The formula asks whether the estimator $\hat{\theta}$ we created has expectation equal to the actual parameter $\theta$. This shows that "on average" (which is what expectation captures), our estimator equals the value of the actual parameter. This is quite a useful property to have, although biased estimators are sometimes preferred because of their other properties.

3) Estimation is one of the most important tasks in statistics: we need estimators to infer unknown quantities from data. As for why we care about bias, unbiasedness, and so on: we need some way to describe the properties of estimators so that we can say when one estimator is better than another. The MVUE is an example of an estimator that is optimal with respect to a particular criterion (minimum variance among unbiased estimators), and thus "best" in that criterion's sense.
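One standard way to say "one estimator is better than another" is to compare mean squared error. The sketch below (assumed normal data, arbitrary seed and sample size) compares the sample mean and the sample median as estimators of a normal location parameter:

```python
import random
import statistics

# Assumed setup: normal data with true location mu = 0, sd 1, samples of size 25.
random.seed(2)
mu, n, trials = 0.0, 25, 5000

mse_mean = 0.0
mse_median = 0.0
for _ in range(trials):
    xs = [random.gauss(mu, 1.0) for _ in range(n)]
    mse_mean += (sum(xs) / n - mu) ** 2
    mse_median += (statistics.median(xs) - mu) ** 2
mse_mean /= trials
mse_median /= trials

print(mse_mean, mse_median)  # for normal data the mean has the smaller MSE
```

Both estimators are unbiased here, so the comparison is purely about spread: for normal data the sample mean is more concentrated around the target.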

SECOND ANSWER

Before learning about estimators, you first need to learn that there is a distinction between "data" and "data generating process." An estimator is an attempt to learn something about the data generating process, by using the data.

Take a basketball player's shots in a game, like this:

Feet   Made the basket?
----   ----------------
7      Yes
10     No
8      Yes
25     Yes
10     No
12     No

In statistics, you view the Yesses and Nos as coming from processes specific to the distance from the goal, like flipping coins that are bent, with the degree of bending related to the distance. Clearly, 0% is not the true measure of the player's ability to make a shot from 10 feet (the "process"), as the data (0/2) seem to suggest. But 0/2 is indeed an estimate.
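The 0/2 figure can be recomputed directly from the table. This small sketch just encodes the game data and takes the naive per-distance fraction:

```python
# The game data from the table above: (distance in feet, made? 1 = Yes, 0 = No)
shots = [(7, 1), (10, 0), (8, 1), (25, 1), (10, 0), (12, 0)]

# Naive per-distance estimate: fraction of shots made from exactly 10 feet
at_10 = [made for feet, made in shots if feet == 10]
estimate = sum(at_10) / len(at_10)
print(estimate)  # 0.0 -- the 0/2 estimate, surely not the player's true ability
```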

An "estimator" considers the potentially observable data values, rather than the actually observed data values, like this:

Feet   Made the basket?
----   ----------------
7      Y1
10     Y2
8      Y3
25     Y4
10     Y5
12     Y6

Say the Y's are coded as 0/1, 0 = Missed, 1 = Made.

The obvious estimator of the player's probability of making the shot from 10 feet is (Y2+Y5)/2. This is a random variable that can take the values 0, 0.5, or 1.0. It is unbiased, but not a particularly good estimator, because we do not believe that the player's true ability from 10 feet is any of 0, 0.5, or 1.
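The distribution of this estimator can be written out explicitly. In the sketch below, the player's true 10-foot probability is set to a hypothetical p = 0.4 (an assumed value, not given in the text) to show that the expectation of (Y2+Y5)/2 equals p even though the estimator itself never takes that value:

```python
# Treat Y2 and Y5 as independent Bernoulli(p) draws, p = 0.4 (hypothetical).
# The estimator (Y2 + Y5) / 2 can only take the values 0, 0.5, or 1.
p = 0.4
dist = {0.0: (1 - p) ** 2, 0.5: 2 * p * (1 - p), 1.0: p ** 2}

# Unbiased: the expectation equals p even though no single outcome equals p.
expectation = sum(value * prob for value, prob in dist.items())
print(expectation)  # 0.4
```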

Unbiasedness of an estimator is not the most important criterion for judging whether it is "good." Accuracy is more important: You want the estimator to be generally close to the target.

A better estimator would "borrow strength" from the nearby data, by assuming that the probability of making the shot is a continuous function of distance. Logistic regression will do that, giving you a better estimator (despite its bias), in the sense that it tends to be closer to the target than the estimator (Y2+Y5)/2 given above.
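As one illustration of this borrowing of strength, here is a sketch that fits a one-predictor logistic regression to the six shots by plain gradient ascent on the log-likelihood. The learning rate and iteration count are arbitrary choices, and this toy fit stands in for whatever software one would actually use:

```python
import math

# Shots: (distance in feet, made?). Logistic regression "borrows strength"
# across distances instead of using only the two 10-foot shots.
shots = [(7, 1), (10, 0), (8, 1), (25, 1), (10, 0), (12, 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit p(make | feet) = sigmoid(a + b * feet) by gradient ascent on the
# log-likelihood (a minimal sketch, not production code).
a, b, lr = 0.0, 0.0, 0.001
for _ in range(50000):
    grad_a = grad_b = 0.0
    for feet, made in shots:
        err = made - sigmoid(a + b * feet)
        grad_a += err
        grad_b += err * feet
    a += lr * grad_a
    b += lr * grad_b

p10 = sigmoid(a + b * 10)
print(p10)  # strictly between 0 and 1, unlike the raw 0/2 estimate
```

Unlike (Y2+Y5)/2, this estimator uses all six shots, so its fitted 10-foot probability is pulled toward values the whole data set supports.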

Thus, a major benefit of the estimator framework is that it gives you a rational way to decide between estimates, such as the simple average versus the logistic regression estimate referred to above. With estimates, all you have is two numbers. Which is better? There is no way to know. But in the estimator framework, you can compare the distributions of potentially observable values of the estimates, and pick the one that generally tends to be closer to the target.