Let $P$ be any distribution over the set $\mathbb N = \{1,2,3,\ldots\}$, and let $S$ be a sample of $k$ i.i.d. draws from $P$.
We can define the missing mass $M_k$ as follows: $M_k = P(\mathbb N \setminus S)$, i.e. the probability mass of the elements that do not appear in the sample.
Now, I want to show that $M_k$ converges in probability to zero. So, by definition, I want to show that for all $\varepsilon, \delta > 0$ there is a $K$ such that if $k \geq K$ then $\Pr(M_k > \varepsilon) \leq \delta$.
First I showed that the expectation is $\operatorname E[M_k] = \sum_{i=1}^\infty p_i(1-p_i)^k$, where $p_i = P(\{i\})$.
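(For completeness, one way to see this is to write $M_k$ as a weighted sum of indicators:
$$\operatorname E[M_k] = \operatorname E\left[\sum_{i=1}^\infty p_i \mathbf 1\{i \notin S\}\right] = \sum_{i=1}^\infty p_i \Pr(i \notin S) = \sum_{i=1}^\infty p_i (1-p_i)^k,$$
since each of the $k$ independent draws misses element $i$ with probability $1-p_i$.)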
Next I tried to use Markov's inequality, and I got
$$\Pr(M_k > \varepsilon) \leq \frac {\sum_{i=1}^\infty p_i(1-p_i)^k} \varepsilon $$
So what I'm actually trying to find is a $k$ such that
$$ \frac {\sum_{i=1}^\infty p_i(1-p_i)^k} \varepsilon \leq \delta $$
$$\implies \sum_{i=1}^\infty p_i(1-p_i)^k \leq \varepsilon \cdot \delta $$
But this is the point where I'm stuck: I can't find a way to isolate $k$. Taking the log of the sum does not seem like a good option. Any ideas?
Thanks!
Let $\epsilon$ be fixed. Then there must be a finite set $T$ such that $P(T) \geq 1-\epsilon$. Note that if every element of $T$ appears in your sample, then $M_k \leq P(\mathbb N \setminus T) \leq \epsilon$; so for such a set $T$, we have $$\Pr(M_k > \epsilon) \leq \Pr(\textrm{at least one element of }T\textrm{ fails to appear in your sample}).$$
You can bound the right-hand side using Markov's inequality (or, equivalently here, a union bound), in a similar way to what you've already tried. The key advantage is that you now have a finite sum instead of an infinite one when you take the expectation (you only care about $i \in T$). So if you want the sum to go to $0$ (or to be less than $\delta$ eventually), it's now enough to have each term of the sum, $(1-p_i)^k$, go to $0$, which it does since $p_i > 0$ for each $i \in T$ (you may choose $T$ to contain only atoms of positive mass).
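To make the argument concrete, here is a small numerical sketch of the truncation idea, assuming (purely as a hypothetical example, not from the post) the geometric distribution $p_i = 2^{-i}$; `tail_bound` and `find_K` are illustrative names.

```python
import math

def tail_bound(k, T, p):
    """Union bound: P(some i in T missing from a sample of size k)
    <= sum over i in T of (1 - p_i)^k."""
    return sum((1 - p(i)) ** k for i in T)

def find_K(eps, delta, p):
    """Smallest k for which the union bound over a finite set T,
    chosen so that P(T) >= 1 - eps, drops below delta."""
    # For p_i = 2^{-i}, take T = {1, ..., m} with P(T) = 1 - 2^{-m} >= 1 - eps.
    m = math.ceil(math.log2(1 / eps))
    T = range(1, m + 1)
    k = 1
    # Each term (1 - p_i)^k -> 0, so this finite sum eventually drops below delta.
    while tail_bound(k, T, p) > delta:
        k += 1
    return k

p = lambda i: 2.0 ** (-i)  # example distribution p_i = 2^{-i}
K = find_K(eps=0.1, delta=0.05, p=p)
# For all k >= K, P(M_k > 0.1) <= 0.05 by the union bound above.
```

The loop terminates precisely because the sum is finite and every $p_i$ in $T$ is positive; that is the whole point of truncating to $T$ before applying the bound.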