Approximating the number of smokers in a whole population from a sample

I would like you to explain how to approach the following problem. In general, I have trouble finding a way into probability problems. Here is my problem:

A fraction $p$ of citizens in a city smoke. We are to determine the value of $p$ by making a survey involving $n$ citizens whom we select randomly. If $k$ of these $n$ people smoke, then $p'=k/n$ will be our result. How large should we choose $n$ if we want our result $p'$ to be closer to the real value $p$ than $0.005$ with probability at least $0.95$? In other words: determine the smallest number $n_0$ such that $P(|p'-p| \leq 0.005)\geq 0.95$ for any $p \in (0,1)$ and $n \geq n_0$.

So here is what I tried: I noticed that $p=\mathbb E p'$, so I wanted to use Chebyshev's inequality, which gives: $P(|p'-p|\geq 0.005)\leq \frac{Var(p')}{0.005^2}$

Thus $P(|p'-p|< 0.005)\geq 1-\frac{Var(p')}{0.005^2}$

Then I want $1-\frac{Var(p')}{0.005^2} =0.95$ so $Var(p') =(1-0.95)\times 0.005^2$

But what now? I'm not sure about this approach, so I also tried starting by defining the event $E=$ "a citizen smokes".

Then $A_i=1$ if $E$ occurs (with probability $w$) and $A_i=0$ if $E$ doesn't occur (with probability $1-w$). Then let $X=\sum_{i=1}^{n}A_i$, which is $Binom(n,w)$ (so it corresponds to the number of smokers among the $n$ citizens). But then I get lost trying to understand what $p'$ is in terms of this, even though I have the feeling that I may be close (?).

Anyway, I need your help please.

Have a good evening,

Herosix

Best answer

The Chebyshev Inequality is

$$P(|X-\mu|<\epsilon)\geq 1-\frac{\sigma^2}{\epsilon^2}$$

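The variance of $p'=\frac{k}n$ is $\frac{p\cdot (1-p)}{n}$, since $k$ is binomially distributed with variance $n\cdot p\cdot (1-p)$. The product $p\cdot (1-p)$ attains its maximum $0.25$ at $p=0.5$, so for every $p$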

$$P(|p'-p|<\epsilon)\geq 1-\frac{0.25}{n\epsilon^2}$$
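The two facts behind this bound, $Var(k/n)=\frac{p(1-p)}{n}$ and $p(1-p)\leq 0.25$, can be checked numerically. The following is only a small sketch: the function name `var_sample_proportion`, the sample size `n = 50` and the grid of $p$ values are illustrative choices, not part of the exercise.

```python
from math import comb


def var_sample_proportion(n: int, p: float) -> float:
    """Exact variance of k/n for k ~ Binomial(n, p), computed from the pmf."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    return sum(((k / n) - mean) ** 2 * w for k, w in enumerate(pmf))


n = 50  # illustrative sample size (assumption, not from the exercise)
for p in (0.1, 0.3, 0.5, 0.8):
    # The two printed values should agree up to floating-point error.
    print(p, var_sample_proportion(n, p), p * (1 - p) / n)

# Locate the maximiser of p(1-p) on a grid of step 0.01 inside (0, 1).
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda q: q * (1 - q))
print(best_p, best_p * (1 - best_p))
```

Because the worst case $p=0.5$ is used, the resulting bound holds for any unknown $p \in (0,1)$, which is what the exercise asks for.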

In your exercise $\epsilon=0.005$, and the probability has to be at least $0.95$. Therefore the final inequality is

$$1-\frac{0.25}{n\cdot 0.005^2}\geq 0.95$$
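Solving the last inequality for $n$, with "at least $0.95$" read as $\geq$, gives $n \geq \frac{0.25}{0.05\cdot 0.005^2} = 200000$. A minimal sketch of the same arithmetic follows; the helper name `chebyshev_sample_size` is hypothetical, and exact fractions are used so that rounding does not shift the result by one.

```python
from fractions import Fraction
from math import ceil


def chebyshev_sample_size(eps: Fraction, confidence: Fraction) -> int:
    """Smallest integer n with 1 - 0.25 / (n * eps^2) >= confidence."""
    delta = 1 - confidence  # allowed failure probability
    return ceil(Fraction(1, 4) / (delta * eps**2))


# Values from the exercise: eps = 0.005, confidence = 0.95.
n0 = chebyshev_sample_size(Fraction(5, 1000), Fraction(95, 100))
print(n0)
```

This is the sample size guaranteed by Chebyshev's inequality. The inequality is only a bound, so a smaller $n$ may already satisfy the requirement.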