I'm pretty sure this is a basic question but I can't seem to find an answer anywhere else.
If I have N objects each given one of M random categories (using a simple uniform random distribution), how do I calculate, on average, the number of categories with at least one object.
I can do this with a brute force method:
SUM m from 1 to M of ((Chance that all objects are in m categories) * m)
This is not feasible for large Ms. Is there an easier way?
Let $K$ be the random variable for number of non-empty categories. Let $P_1,P_2,...P_M$ be the probability that group $M$ is non-empty. By symmetry, $P_1=P_2=...=P_M$, so lets called this shared value $P$. By linearity of expectation, we have
$E(K) = \sum_{i=1}^M P_i = M\cdot P$
We can calculate $P = 1-(1-\frac{1}{M})^N$, hence
$E(K) = M\cdot (1-(1-\frac{1}{M})^N)$.
If this feels like cheating (because the events of different categories being non-empty are dependent), remember that linearity of expectation holds for dependent variables as well.