How do frequentists estimate probabilities?


Say I have i.i.d. Bernoulli trials $X_1, \cdots, X_n$. Now I observe $\sum_{i=1}^k X_i$ and want to estimate the probability that $\sum_{i=k+1}^n X_i \geq t$, where $t$ is some threshold.

I am used to Bayesian thinking. Therefore, no prior => no estimate. If I have a prior, I can simply use Bayes' theorem to do this. How do frequentists solve this kind of problem?

I could estimate $p$ as the sample average of the first $k$ random variables. However, I should also factor in the chance that this estimate is wrong. It is not clear how to do this in a formally correct way.
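As a sketch of that plug-in approach (function name and the sample numbers are my own, not from the question): estimate $p$ by the sample mean of the first $k$ trials, then treat the remaining $n-k$ trials as $\mathrm{Binomial}(n-k, \hat p)$ and evaluate its upper tail at $t$. This is exactly the estimate that ignores the uncertainty in $\hat p$.

```python
from math import comb

def plugin_tail_estimate(successes, k, n, t):
    """Plug-in estimate of P[sum_{i=k+1}^n X_i >= t]:
    estimate p by the sample mean of the first k trials,
    then evaluate the Binomial(n - k, p_hat) upper tail at t.
    (Illustrative helper, not from the original question.)"""
    p_hat = successes / k
    m = n - k  # number of remaining trials
    return sum(comb(m, j) * p_hat**j * (1 - p_hat)**(m - j)
               for j in range(t, m + 1))

# e.g. 6 successes in the first k = 10 of n = 20 trials, threshold t = 5
est = plugin_tail_estimate(6, 10, 20, 5)
```

The criticism in the question applies directly: this number is a point estimate of $f(\hat p)$, and says nothing by itself about how far it may be from $f(p)$.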

Now, I could estimate $P[\hat p \leq p - \epsilon]$ using the Hoeffding bound as described by Michael, sure. But how can I say anything about the probability of $p$ being something? I mean, say $p$ is $1/2$. Then the first $k$ trials are irrelevant and the frequentist estimate is totally wrong. So maybe the frequentist approach has some implicit assumptions which I don't understand and which allow me to do these things?
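For concreteness (a hypothetical illustration, not part of the question): the two-sided Hoeffding bound $P[|\hat p - p| \geq \epsilon] \leq 2e^{-2k\epsilon^2}$ can be inverted. Setting the right-hand side to $\delta$ gives $\epsilon = \sqrt{\ln(2/\delta)/(2k)}$, i.e. a confidence interval for $p$ that holds with probability at least $1-\delta$ no matter what $p$ is.

```python
from math import sqrt, log

def hoeffding_interval(p_hat, k, delta=0.05):
    """Two-sided Hoeffding confidence interval for p:
    P[|p_hat - p| >= eps] <= 2 exp(-2 k eps^2) = delta
    => eps = sqrt(log(2 / delta) / (2 k)).
    (Illustrative helper, not from the original post.)"""
    eps = sqrt(log(2 / delta) / (2 * k))
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)
```

Note this is a statement about the random interval covering the fixed $p$, not a probability distribution over $p$; that is the frequentist reading the question is asking about.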


Here is one approach:

Suppose $\{X_i\}_{i=1}^{\infty}$ are i.i.d. Bernoulli with unknown parameter $p \in [0,1]$, where $p = P[X_1 = 1]$. Suppose you want to estimate a value $f(p)$ for some function $f:[0,1]\rightarrow [0,1]$. Finally, suppose $f$ is $L$-Lipschitz, so that $$ |f(x)-f(y)|\leq L|x-y| \quad \forall x, y \in [0,1].$$

From $k$ samples $\{X_1, \ldots, X_k\}$ we can define $$ \hat{p}_k = \frac{1}{k}\sum_{i=1}^k X_i$$ and let our estimate be $f(\hat{p}_k)$. Then for any $\epsilon>0$ we obtain: \begin{align*} P[|f(p) - f(\hat{p}_k)|\geq \epsilon] &\overset{(a)}{\leq} P[L|p-\hat{p}_k|\geq \epsilon] \\ &\overset{(b)}{\leq} 2 e^{\frac{-2k\epsilon^2}{L^2}} \end{align*} where (a) holds by the Lipschitz property and (b) holds by the Chernoff-Hoeffding bound.
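A quick Monte Carlo sanity check of this bound (my own sketch; the function, constants, and sample sizes are illustrative). Take $f(p) = p^2$, which is $L$-Lipschitz on $[0,1]$ with $L = 2$, and compare the empirical frequency of $|f(p) - f(\hat p_k)| \geq \epsilon$ against $2e^{-2k\epsilon^2/L^2}$:

```python
import random
from math import exp

def deviation_bound_check(p=0.5, k=100, eps=0.2, L=2.0,
                          trials=20000, seed=0):
    """Estimate P[|f(p) - f(p_hat_k)| >= eps] for f(p) = p^2
    by simulation and compare against the Chernoff-Hoeffding
    bound 2 exp(-2 k eps^2 / L^2). (Illustrative check only.)"""
    rng = random.Random(seed)
    f = lambda x: x * x  # 2-Lipschitz on [0, 1]
    bad = 0
    for _ in range(trials):
        p_hat = sum(rng.random() < p for _ in range(k)) / k
        if abs(f(p) - f(p_hat)) >= eps:
            bad += 1
    empirical = bad / trials
    bound = 2 * exp(-2 * k * eps**2 / L**2)
    return empirical, bound
```

The bound is distribution-free in $p$: the simulated deviation frequency should stay below it for every choice of $p$, which is precisely the "no prior needed" guarantee the frequentist approach offers.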