Introduction to Chebyshev's Inequality


My book is introducing the need for Chebyshev's Inequality in the following way, and I'm confused at some of the constructs used here.

Suppose we have a biased coin, but we don’t know what the bias is. To estimate the bias, we toss the coin $n$ times and count how many Heads we observe. Then our estimate of the bias is given by $\hat{p} = \frac{1}{n}S_n$ , where $S_n$ is the number of Heads in the n tosses. Is this a good estimate? Let $p$ denote the true bias of the coin, which is unknown to us. Since $E(S_n) = np$, we see that the estimator $\hat{p}$ has the correct expected value: $E(\hat{p})=\frac{1}{n}E(S_n) = p$. This means when $n$ is sufficiently large, we can expect $\hat{p}$ to be very close to $p$; this is a manifestation of the Law of Large Numbers.
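The Law-of-Large-Numbers claim at the end of this excerpt is easy to check empirically. This is not from the book; it's a minimal Python sketch, assuming a true bias of $p = 0.3$ (a value I picked arbitrarily) and a seeded generator for reproducibility:

```python
import random

def estimate_bias(p, n, seed=0):
    """Toss a coin with bias p a total of n times and return p_hat = S_n / n."""
    rng = random.Random(seed)
    s_n = sum(1 for _ in range(n) if rng.random() < p)  # S_n = number of Heads
    return s_n / n

# The estimate tightens around the true bias p = 0.3 as n grows.
for n in (10, 1000, 100_000):
    print(n, estimate_bias(0.3, n))
```

Running this, the estimate for $n = 10$ can be far off, while the estimate for $n = 100{,}000$ sits very close to $0.3$, which is exactly the behavior the book is gesturing at.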

Starting from the beginning, this is my understanding:

$\hat{p} = \frac{1}{n}S_n$ comes from the fact that $n\hat{p} = S_n$

$E(S_n) = np$ because...

Let $X$ be the number of heads in $n$ coin tosses. Let $X_i$ be $1$ if the $i$-th toss is heads and $0$ otherwise. Then $E[X_i] = Pr[X_i = 1] = p$. By linearity of expectation we can say $E(X) = np$.
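As a sanity check (my own addition, not part of the argument), the linearity result $E(X) = np$ can be confirmed by computing the expectation of a Binomial$(n, p)$ variable directly from its definition:

```python
from math import comb

def expected_heads(n, p):
    """Exact E[X] for X ~ Binomial(n, p), computed straight from the definition
    E[X] = sum over k of k * P(X = k)."""
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

# Agrees with the linearity-of-expectation shortcut n * p:
print(expected_heads(10, 0.3))  # 3.0 up to floating-point error
```

The brute-force sum and the one-line shortcut $np$ give the same number, which is the point of using linearity.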

The part I'm confused about is the following line:

...we see that the estimator $\hat{p}$ has the correct expected value: $E(\hat{p})=\frac{1}{n}E(S_n) = p$.

So they are definitely treating $\hat{p}$ as a random variable, and it apparently has a distribution. This sort of makes sense to me, since $\hat{p}$ is an estimate and can take on values $\hat{p} \in [0, 1]$. Apart from that, I'm very confused about the right-hand side of that statement. Any clarification would be much appreciated.

EDIT: After reading my post, I realized the same logic I applied to the initial definition of $\hat{p}$ could be of use. Does this make sense?

$nE(\hat{p}) = E(S_n)$ ? If so, where does $p$ come from?

Best answer:

Your basic concepts in statistics are not yet clear. Let me try to help you.

$p$ is a parameter, which is unknown to you. You want to estimate it, i.e. have some idea about $p$. You thus do the following experiment: you take a coin whose probability of landing Head is $p$ and you toss the coin $n$ times. Let $X_i=1$ if the $i$-th toss is Head, and $X_i=0$ if $i$-th toss is Tail.

You correctly noted that $E(X_i)=P(X_i=1)=p$.

Now you believe, intuitively, that the proportion of Heads should give you a good estimate of $p$. So you take the estimator $\hat{p}=\dfrac{S_n}{n}$ where $S_n=X_1+X_2+\dots+X_n$.

Since $X_i$ are random, your $S_n$ is random, and thus your $\hat{p}$ is also random. You now want to calculate its expected value.

$E(\hat{p})=E\left(\dfrac{S_n}{n}\right)=\dfrac{1}{n}E(S_n)=\dfrac{1}{n}E\left(\sum_{i=1}^n X_i\right)=\dfrac{1}{n}\sum_{i=1}^n E(X_i)=\dfrac{1}{n}\cdot np=p$
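To see this unbiasedness concretely, here is a quick simulation sketch (my own, with arbitrarily chosen values $p = 0.3$, $n = 50$, and a fixed seed): average many independent realizations of $\hat{p}$ and check that the average lands near $p$.

```python
import random

def phat(p, n, rng):
    """One realization of the estimator p_hat = S_n / n."""
    return sum(1 for _ in range(n) if rng.random() < p) / n

rng = random.Random(42)
p, n, trials = 0.3, 50, 20_000

# Averaging many independent p_hat values approximates E(p_hat).
avg = sum(phat(p, n, rng) for _ in range(trials)) / trials
print(avg)  # close to p = 0.3, since E(p_hat) = p
```

Each individual $\hat{p}$ fluctuates quite a bit at $n = 50$, but the average over many repetitions of the experiment settles on $p$; that is precisely what "unbiased" means.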

So $\hat{p}$ is an unbiased estimator of $p$. Note that $p$ is non-random, but $\hat{p}$ is random: you observe $\hat{p}$ from the data, never $p$ itself.