Sampling with replacement. $P[\upsilon_m>n]=\prod_{i=2}^{n}(1-\frac{i-1}{m})$.

84 Views Asked by At

Let $\{X_n,n \leq 1\}$ be i.i.d. and uniformly distributed on the set $\{1,...,m\}$. In repeated sampling, let $v_m$ be the time of the first coincidence; that is, the when we first get a repeated outcome $$\upsilon_m:=\inf\{n\geq 2:X_n \in \{X_1,X_2,...,X_{n-1}\}\}.$$
Verify $$P[\upsilon_m>n]=\prod_{i=2}^{n}(1-\frac{i-1}{m}).$$

In an exercise for convergence in distribution, this probability came up but I'm not able to verify it. I'm guessing is a similar argument from the birthday problem.

1

There are 1 best solutions below

1
On BEST ANSWER

Note that: $$ \{v_m>n\} = \{X_1,X_2,\dots,X_n\text{ are all different}\}\triangleq E_n. $$ This implies, \begin{align*} \mathbb{P}(v_m>n)=\mathbb{P}(X_1,X_2,\dots,X_n\text{ are all different})\triangleq p_n. \end{align*} We now compute $p_n$ in terms of $p_{n-1}$, which is the probability of the event that $X_1,\dots,X_{n-1}$ are all different. Note that, $$ p_n =\mathbb{P}(X_1,X_2,\dots,X_n\text{ are all different}) = \mathbb{P}(E_n|E_{n-1})\mathbb{P}(E_{n-1})=\mathbb{P}(X_n\notin\{X_1,\dots,X_{n-1}\})p_{n-1}. $$ Thus, $\frac{p_n}{p_{n-1}}=1-\frac{n-1}{m}$. Now, with the same logic, we have $\frac{p_k}{p_{k-1}} = 1-\frac{k-1}{m}$, for every $k\geqslant m$. Multiplying this events and capturing a telescoping product, we precisely recover the desired claim.