Comparing the Speeds of Cars and Airplanes: Fisher Information of Estimators


Consider the following example:

  • An airplane can travel faster than a car
  • However, there are certain situations in which a car might be able to travel faster than an airplane (e.g. over very short distances, where an airplane would not even be able to get off the ground)

Thus, we can summarize this situation as follows: in many cases an airplane can travel faster than a car, but in some situations a car can travel faster than an airplane.

I would like to now apply this analogy of comparing cars and airplanes to ranking and selecting Statistical Estimators based on their Fisher Information (https://en.wikipedia.org/wiki/Fisher_information).

In simple terms, the Fisher Information measures the amount of information that an observed Random Variable carries about some Parameter of some Probability Distribution.

In the context of Statistical Estimators, we say that Estimators with larger values of Fisher Information are better than Estimators with smaller values of Fisher Information. This concept is closely related to the Cramér-Rao Lower Bound (https://en.wikipedia.org/wiki/Cram%C3%A9r%E2%80%93Rao_bound) - the lowest possible Variance that any unbiased Estimator of a Parameter can achieve, which is the reciprocal of the Fisher Information.
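As a minimal worked example of this relationship (assuming, for illustration, $n$ i.i.d. observations from a normal distribution with known variance $\sigma^2$ - a setting I am supplying, not one stated above):

```latex
% Fisher information of n i.i.d. draws X_1, ..., X_n ~ N(\mu, \sigma^2), \sigma known:
I(\mu) = \frac{n}{\sigma^2}
% Cramér-Rao bound for any unbiased estimator \hat{\mu} of \mu:
\operatorname{Var}(\hat{\mu}) \ge \frac{1}{I(\mu)} = \frac{\sigma^2}{n}
% The sample mean attains this bound, so more information
% (larger n, smaller \sigma^2) means a smaller achievable variance.
```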

Suppose we want to estimate some "property" (e.g. the "mean") of some observed data - we might have two Estimators under consideration: "Estimator A" and "Estimator B". Often, to justify the use of one of these estimators over the other, we compare the Cramér-Rao Lower Bounds associated with these estimators. This is because we are generally interested in Estimators with overall lower variances (i.e. higher Fisher Information corresponds to a lower Cramér-Rao Bound, and therefore a lower achievable Variance).

As an example, if my friend and I both measure someone's height:

  • I report my measurement to be 175cm ± 13cm
  • My friend reports a measurement of 175cm ± 5cm
  • Although we both reported the same height, my friend's measurement appears to be more "precise", since my friend seems more certain and confident in the measurement they collected. What we are seeing here is that my friend (i.e. their estimator) reported a measurement with "lower variance"

Going back to the previous example - given some observed data and the choice of using Estimator A or Estimator B, provided that both of these estimators are otherwise comparable (e.g. they estimate the same property and are both "unbiased" https://en.wikipedia.org/wiki/Bias_of_an_estimator), we can calculate the theoretical Cramér-Rao Bound for each of these estimators and see which one has the lower Cramér-Rao Bound, and therefore the lower theoretically possible Variance. Then, we can justify our choice of estimator based on this rationale.
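As a sketch of what such a comparison looks like in practice, here is a small simulation (the estimators and distribution are my own illustrative choices, not from the question) comparing two unbiased estimators of a normal mean - the sample mean and the sample median - against the Cramér-Rao Bound σ²/n:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 100, 1.0, 20_000

# Draw many samples of size n from N(0, sigma^2) and compute both estimators on each.
samples = rng.normal(0.0, sigma, size=(reps, n))
mean_est = samples.mean(axis=1)          # "Estimator A": sample mean
median_est = np.median(samples, axis=1)  # "Estimator B": sample median

# Cramér-Rao bound for estimating the mean: I(mu) = n / sigma^2, so CRB = sigma^2 / n.
crb = sigma**2 / n

print(f"CRB:               {crb:.5f}")
print(f"Var(sample mean):  {mean_est.var():.5f}")    # ~ sigma^2 / n (attains the bound)
print(f"Var(sample median): {median_est.var():.5f}")  # ~ pi * sigma^2 / (2n), strictly larger
```

Here the sample mean actually attains the bound, while the median's variance is roughly π/2 times larger, so the usual rationale favors the mean.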

This leads me to the following question: by using this "absolute decision criterion" (i.e. pick the estimator with the lower Cramér-Rao Bound, and hence the lower possible Variance) - is it possible that this criterion can sometimes lead us into a situation similar to the "Car vs. Airplane" scenario?

In other words, the Cramér-Rao Lower Bound quantifies the "best possible performance" of an estimator (in terms of Variance) - but should we not also take the "average possible performance" of an estimator into consideration? (I don't even know if there is such a thing as a Cramér-Rao "Average Bound"?)
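To show that a "Car vs. Airplane" reversal can in fact happen, here is a hedged sketch (again with estimators and distributions of my own choosing): the sample mean beats the sample median on Gaussian data, but the heavier-tailed Laplace distribution reverses the ranking:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 20_000

def estimator_variances(draw):
    """Empirical variances of the sample mean and sample median over many replications."""
    samples = draw((reps, n))
    return samples.mean(axis=1).var(), np.median(samples, axis=1).var()

# "Airplane" conditions: Gaussian data, where the mean wins.
v_mean_g, v_med_g = estimator_variances(lambda s: rng.normal(0.0, 1.0, size=s))

# "Car" conditions: heavy-tailed Laplace data, where the median wins.
v_mean_l, v_med_l = estimator_variances(lambda s: rng.laplace(0.0, 1.0, size=s))

print(f"Gaussian: Var(mean)={v_mean_g:.4f}  Var(median)={v_med_g:.4f}")
print(f"Laplace:  Var(mean)={v_mean_l:.4f}  Var(median)={v_med_l:.4f}")
```

Which estimator is "faster" depends on the conditions, exactly as with the car and the airplane - a ranking derived under one model need not hold under another.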

If I score either 100% or 0% on exams, whereas my friend consistently scores 71%, isn't my friend the more reliable student? Shouldn't Statistical Estimators be held to similar standards?

Thanks!