My question is about calculating the expected value of the amount of distinct dice faces obtained (observed at least once) by rolling a $n$-faced dice $m$ times.
Assume a $n$-faced unbiased dice is rolled $m$ times, and the probability of obtaining $k$ ($k=1,\cdots,n$) kinds of dice faces is denoted as $P_{n,m}(k)$.
With $P_{n,m}(k)$, the expected value of obtained dice faces by rolling $n$-faced dices $m$ times is defined as follows:
$E=\sum_{k=1}^{n} k\cdot P_{n,m}(k)$
I want to calculate $E$ in constant time for my program, so I want to know the explicit formula of this $E$, or an approximation formula of $E$.
I was able to formularise the recurrence relation of $P_{n,m}(k)$ as below, while I could not find out the explicit formulas of $P_{n,m}(k)$ and $E$:
- $\sum_{k=1}^{n}P_{n,m}(k) = 1$
- Boundary conditions: $P_{n,1}(k=1) = 1, P_{n,1}(k>1) = 0$
- Recurrence relation 1: $P_{n,m+1}(1) = \frac{1}{n} P_{n,m}(1)$
- Recurrence relation 2: $P_{n,m+1}(k) = \frac{k}{n} P_{n,m}(k) + \frac{n - (k-1)}{n} P_{n,m}(k-1)$
I could calculate $E$ for specific $n,m,k$ values by calculating every $P_{n,m}(k)$ step-by-step, and worked well for small $n$ values around 100, but its computational complexity seems $O(n^2)$ or more than that and far from practical for $n>1000$.
Give the faces the numbers $1,\dots,n$ and let $X_i$ take value $1$ if face $i$ shows up during at least one of the $m$ rolls. Let it $X_i$ take value $0$ otherwise.
Then to be found is the expectation of $X:=\sum_{i=1}^nX_i$.
Applying linearity of expectation and symmetry we find:$$\mathsf EX=\mathsf E\sum_{i=1}^nX_i=\sum_{i=1}^n\mathsf EX_i=n\mathsf EX_1=n\mathsf P(X_1=1)=n(1-\mathsf P(X_1=0))=$$$$n\left(1-\left(1-\frac{1}{n}\right)^m\right)$$