The problem is as follows:
A survey is being conducted in a city with a million ($10^6$) people. A sample of size 1000 is collected by choosing people in the city at random, with replacement and with equal probabilities for everyone in the city. Find a simple, accurate approximation to the probability that at least one person will get chosen more than once. Hint: Indicator r.v.s are useful here, but creating 1 indicator for each of the million people is not recommended since it leads to a messy calculation. Feel free to use the fact that 999 ≈ 1000.
I'm a little lost on how to approach this. My initial thought is that this sounds like the capture-recapture problem, where each new sample has size 1 and the initial capture sample grows by 1 with each trial. I then realized that the captured sample would only grow if the person hadn't been chosen before. I think I am making this too complicated, and I am not sure how I would carry that calculation out or how to incorporate indicator random variables into it. Additionally, I am confused as to how 999 ≈ 1000 is useful. Any ideas on how to get started with this would be helpful.
This is equivalent to the birthday problem. Stated another way: given a generalized year with $10^6$ days and $1000$ people in a room, what is the probability that at least two of them have the same birthday? There are $\binom{1000}{2}$ pairs of people. Let $X$ be the number of pairs that share a birthday (i.e. pairs of draws that picked the same person). Then $$X=\sum_{i=1}^{\binom{1000}{2}}I_i$$ where $$I_i=\begin{cases} 1 &\text{ if the two people in pair } i \text{ share a birthday}\\ 0 &\text{ otherwise } \end{cases}$$ Each $I_i$ is a Bernoulli variable with $p=1/10^6$. The $I_i$ are pairwise independent but not mutually independent (if A matches B and B matches C, then A must match C). The dependence is weak and $p$ is tiny, though, so $X$ is approximately Poisson with $\lambda =E(X)=\binom{1000}{2}\frac{1}{10^6}=\frac{1000(999)}{2}\frac{1}{10^6}$. Using the hint that $999\approx 1000$ we get $\lambda\approx\frac{1}{2}$. At least one person is chosen more than once exactly when $X>0$, so the probability is \begin{equation} P(X>0)\approx 1-e^{-1/2}\approx 0.393 \end{equation}
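As a sanity check on the Poisson approximation, here is a minimal Python sketch that compares it with the exact answer. The exact probability of no repeats is the product $\prod_{k=0}^{999}\left(1-\frac{k}{10^6}\right)$, since the $(k+1)$-th draw must avoid the $k$ distinct people already chosen. The function names below are my own, chosen just for illustration.

```python
import math


def exact_collision_probability(population: int, sample_size: int) -> float:
    """Exact P(at least one repeat) when sampling with replacement."""
    # log P(no repeat) = sum over k of log(1 - k/population);
    # log1p/expm1 keep precision when the terms are tiny.
    log_no_repeat = sum(math.log1p(-k / population) for k in range(sample_size))
    return -math.expm1(log_no_repeat)


def poisson_collision_probability(population: int, sample_size: int) -> float:
    """Poisson approximation with lambda = C(sample_size, 2) / population."""
    lam = math.comb(sample_size, 2) / population
    return -math.expm1(-lam)


exact = exact_collision_probability(10**6, 1000)
poisson = poisson_collision_probability(10**6, 1000)
rounded = 1 - math.exp(-0.5)  # uses the hint 999 ≈ 1000, so lambda ≈ 1/2

print(f"exact:                 {exact:.6f}")
print(f"Poisson (lambda exact): {poisson:.6f}")
print(f"Poisson (lambda = 1/2): {rounded:.6f}")
```

All three values should agree to about three decimal places, which suggests that both the Poisson approximation and the rounding $999\approx 1000$ cost very little here.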