Confidence/Credible intervals implying that estimated mean from sample is equal to population mean in expectation?


I need some help understanding the fallacy in the following reasoning (thank you in advance!). It is essentially implying that for a single sample from a population, you can know a population parameter (like the EV) precisely with a few seemingly reasonable assumptions. I’ll frame it as an example.

Let’s say that we are trying to infer something about the return distribution (assume normal) of a financial instrument (population distribution). We make an assumption about the variance of this distribution. We then create samples and calculate their averages, and each sample consists of 100 draws from the population distribution (which again we don’t know).

Using our assumption about the variance of the population distribution, we can calculate the variance of the sampling distribution of the mean, i.e., the distribution of sample averages that results from repeating the sampling (each sample again being 100 draws from the population distribution) infinitely many times. We can compute a 95% confidence or Bayesian credible interval using the variance of this sampling distribution, which implies that if we take one more sample and calculate its average, the probability that the population mean falls within said interval around the sample average is 95%. (Yes, I realize frequentists will challenge this interpretation for confidence intervals, but the credible interval unequivocally states this.) So, if we take an incremental sample, it would follow that the probability that the population mean falls within the band described by the confidence interval around the estimated mean of that sample is 95%.
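To make the coverage claim concrete, it can be checked with a quick simulation (a sketch; the true mean, the assumed standard deviation, and the number of trials are all made-up values, not from the setup above):

```python
import math
import random

random.seed(0)

# Hypothetical population: normal returns with mean 0.05 and
# assumed (known) standard deviation 0.2.
mu_true, sigma = 0.05, 0.2
n = 100                      # draws per sample
se = sigma / math.sqrt(n)    # standard error of the sample mean
z = 1.96                     # 95% two-sided normal quantile

covered = 0
trials = 10_000
for _ in range(trials):
    xbar = sum(random.gauss(mu_true, sigma) for _ in range(n)) / n
    if xbar - z * se <= mu_true <= xbar + z * se:
        covered += 1

print(covered / trials)  # close to 0.95
```

About 95% of the intervals contain the true mean, which is exactly what the interval's coverage guarantee says; note that each individual interval is centered on its own sample average.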

We’ve now basically created a probability distribution for the population mean itself, and the midpoint of that distribution is the estimated sample mean. If we assume that this distribution is normal, or even just that the likelihood of the population mean/EV falling within the 95% confidence band around the estimated sample mean or in the 2.5% tails on either side of the band is symmetric, then that would imply that, in expectation, the population mean/EV equals the estimated sample mean.

It would seem obvious that this cannot be true, given it would imply that you could then know your true EV for any population simply by referencing a sample of any size. You can’t say Steph Curry’s free-throw make probability is 50% after watching him shoot 2 free throws and miss 1.

Where exactly does this go wrong?



Suppose we have a random variable $X$ with finite mean. That random variable has some distribution.

For a fixed $n$ and iid samples $X_1, \ldots, X_n$ from the distribution of $X$, we can construct the random variable $\frac{1}{n}\sum_{i=1}^n X_i$, which has its own distribution (possibly) distinct from the distribution of $X$. When you talk about the distribution we created from the sample, it doesn't sound like you're talking about this distribution.

If we observe actual values $x_1, \ldots, x_n$, we can construct the empirical distribution, which has cdf $F(x)=\frac{1}{n}\sum_{i=1}^n 1(x_i\leqslant x)$, where $1(x_i\leqslant x)=1$ if $x_i\leqslant x$ and $1(x_i\leqslant x)=0$ otherwise. This is also (possibly) distinct from the two previous distributions. When you talk about the distribution we created from the sample, it doesn't sound like you're talking about this distribution either.
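In code, the empirical cdf of a toy observed sample looks like this (a sketch; the sample values are arbitrary):

```python
def empirical_cdf(xs):
    """Return F(x) = (1/n) * #{i : x_i <= x} for the observed sample xs."""
    n = len(xs)
    def F(x):
        return sum(1 for xi in xs if xi <= x) / n
    return F

# Arbitrary observed values x_1, x_2, x_3:
F = empirical_cdf([2.0, 1.0, 3.0])
print(F(0.5), F(1.0), F(2.5), F(3.0))  # 0.0 0.3333333333333333 0.6666666666666666 1.0
```

The resulting step function jumps by $1/n$ at each observed value, so it depends only on the realized sample, not on the population distribution.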

If we finally take the observed sample mean $\overline{x}=\frac{1}{n}\sum_{i=1}^n x_i$ and select some distribution which is symmetric about $\overline{x}$ (note that there are many such distributions), we get another distribution which is generally not equal to any of the previous three. When you talk about the distribution that we created from the sample, based on your comments, it sounds like you're talking about one of these distributions. But knowing this distribution doesn't tell us the population distribution (or the other two preceding distributions).

The fact that the confidence interval is symmetric about the sample mean doesn't mean that the population distribution (or the distribution of $\frac{1}{n}\sum_{i=1}^n X_i$ or the empirical distribution) is symmetric about the sample mean. For example, if we have a normal population with mean $0$ and we pick a sample which has sample mean $1$, the population distribution is still symmetric around $0$, not $1$. The confidence interval we construct is symmetric about $1$.
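A small numerical sketch of that last example (standard normal population with variance assumed known; all numbers illustrative):

```python
import math
import random

random.seed(1)

# Population: standard normal, symmetric about its true mean 0.
n = 100
sample = [random.gauss(0.0, 1.0) for _ in range(n)]
xbar = sum(sample) / n
se = 1.0 / math.sqrt(n)

# The 95% interval is symmetric about xbar, whatever xbar turned out to be;
# the population distribution is still symmetric about 0.
lo, hi = xbar - 1.96 * se, xbar + 1.96 * se
print(round(xbar, 3), round(lo, 3), round(hi, 3))
```

The midpoint of the interval is the sample mean by construction; nothing about that construction moves the center of the population distribution.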


I don't understand most of what you've said; for example you say

We make an assumption about the variance of this distribution (let’s call said assumption sigma 1)

but then you never state what this assumption is, only that you've made it (?). Do you mean that you're assuming that the variance is equal to $\sigma_1^2$?

You also use three different terms "estimated sample mean," "population mean," and "estimated mean," and I don't know what these are supposed to be referring to. I see (at least) three different means here and this is what I would call them:

  • The true mean $\mathbb{E}(X)$ of the distribution being sampled from; is this what you mean by "population mean"?
  • The sample mean $\frac{X_1 + \dots + X_n}{n}$; is this what you mean by "estimated sample mean" or is this something else?
  • The expected sample mean $\mathbb{E} \left( \frac{X_1 + \dots + X_n}{n} \right) = \mathbb{E}(X)$, which equals the true mean. I don't know when if ever you're referring to this.
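The distinction between the second and third of these can be seen in a quick simulation (a sketch; the Bernoulli "free throw" probability of 0.7 and the sample size of 2 are invented for illustration): one observed sample mean is just a random draw, while the average of sample means over many repetitions approaches the true mean.

```python
import random

random.seed(2)

mu_true = 0.7   # hypothetical true make probability E(X)
n = 2           # tiny sample, as in the free-throw example

# One observed sample of size n: its sample mean is a random variable.
sample = [1 if random.random() < mu_true else 0 for _ in range(n)]
sample_mean = sum(sample) / n

# The *expected* sample mean equals the true mean; approximate it by
# averaging the sample mean over many independent repetitions.
reps = 100_000
total = 0.0
for _ in range(reps):
    s = [1 if random.random() < mu_true else 0 for _ in range(n)]
    total += sum(s) / n
expected_sample_mean = total / reps

print(sample_mean)           # whatever this one sample happened to give
print(expected_sample_mean)  # close to mu_true = 0.7
```

A single `sample_mean` from two shots can only be 0, 0.5, or 1, yet its expectation is the true mean; only the former is ever observed.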

then that would imply that in expectation, the expected value of the population mean/EV is the estimated sample mean.

I don't understand most of what you've said in this paragraph, and I especially don't understand this sentence; here it sounds like "population mean" now means the sample mean, but now I don't know what you mean by "estimated sample mean"; if you mean the expected sample mean then this sentence is a tautology, but I don't know what else you could mean. (My sincere apologies for using the word "mean" so many times there!)

it would imply that that you could then know your true EV for any population simply by referencing a sample of any size.

Since I don't even understand what terms you're using, nor do I understand what argument you have in mind because of that, it's not at all obvious to me that anything you've said implies this. You are of course correct that obviously you cannot do this. As user 469053 appears to be suggesting in the comments, maybe you're confusing the sample mean with the expected sample mean; it is true that the expected sample mean is the true mean but you never learn the expected sample mean by sampling, only the sample mean, so there's no contradiction here.


I don't understand anything about frequentist statistics so here is how I would set things up in a Bayesian way. We take samples from a normal distribution of known variance $\sigma^2$ but unknown mean $\mu$ which we want to estimate. Given $n$ samples $X_1, \dots, X_n$ from this distribution we can compute the log likelihood for observing our sample given a particular mean $\mu$, which, as a function of $\mu$ and up to an additive constant, is proportional to

$$- \sum_{i=1}^n (X_i - \mu)^2.$$

The maximum likelihood estimate for the mean $\mu$ is then the value of $\mu$ that maximizes this quantity, i.e. minimizes the sum of squares $\sum_{i=1}^n (X_i - \mu)^2$, which is classically known to be given by the sample mean

$$\mu_{MLE} = \frac{X_1 + \dots + X_n}{n}.$$
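One can sanity-check this numerically with a crude grid search over candidate values of $\mu$ (a sketch; the sample values and grid resolution are arbitrary):

```python
# Confirm numerically that the sample mean minimizes sum((x_i - mu)^2)
# for a small fixed sample (values chosen arbitrarily).
xs = [1.2, -0.4, 0.9, 2.1, 0.3]
sample_mean = sum(xs) / len(xs)  # 0.82

def sse(mu):
    return sum((x - mu) ** 2 for x in xs)

# Grid search over mu in [-2, 3] with step 0.001.
grid = [i / 1000 for i in range(-2000, 3001)]
mu_hat = min(grid, key=sse)

print(sample_mean, mu_hat)  # both 0.82 (up to grid resolution)
```

The grid minimizer lands on the sample mean, as the calculus argument predicts.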

More generally, given a prior on $\mu$ we can consider the maximum a posteriori estimate instead. The MLE corresponds to a uniform prior over all values of $\mu$, which is an improper prior.

Instead of just considering the MLE we can consider the entire posterior distribution over possible values of $\mu$ given our sample, which (starting from the improper uniform prior) is a Gaussian with mean $\mu_{MLE}$ and variance $\frac{\sigma^2}{n}$. So the posterior mean, mode, and median are all equal to $\mu_{MLE}$. Now $\mu_{MLE}$ is itself a random variable, and it has the same mean as the distribution being sampled from, namely the unknown $\mu_{\text{True}}$. But again, we never learn this mean because we only ever get a specific sample; we can only compute $\mu_{MLE}$ and whatever else we want to compute from our specific sample.
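A minimal sketch of this setup (all numbers invented): with $\sigma$ known and a flat prior, the posterior over $\mu$ is Gaussian with mean equal to the sample mean and variance $\sigma^2/n$.

```python
import random

random.seed(3)

# Assumed setup: samples from N(mu_true, sigma^2), sigma known,
# flat (improper) prior on mu. All numbers are invented.
mu_true, sigma, n = 1.0, 2.0, 100
xs = [random.gauss(mu_true, sigma) for _ in range(n)]

# Posterior over mu is Gaussian with these parameters:
post_mean = sum(xs) / n    # = mu_MLE, the sample mean of this one sample
post_var = sigma ** 2 / n  # shrinks as n grows

print(post_mean, post_var)
```

Note `post_mean` is computed from the one sample we actually drew; it is near `mu_true` but not equal to it, and nothing in the computation reveals `mu_true` itself.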