I don't imagine this is a very hard or special problem. Similar questions have probably been asked many times, but I don't know how to solve it or what to search for to find the answer.
Suppose I have an infinite supply of, say, Scrabble tiles, with an equal probability of drawing any letter of the alphabet, and I draw n tiles. How many groups of two or more of the same letter would I expect to find?
For example, suppose I draw 300 tiles at random off the factory conveyor belt, group the letters together so that all A's are in one pile, all B's are in another pile, and so on, and then discard all lone tiles. How many piles would I have?
What is the general formula for this, given a variable alphabet size too?
I realize there is a chance of drawing nothing but A's and getting only one pile, or on the other hand drawing 2 or more of every letter and having 26 piles. But I figure there is a way to answer the question by averaging the probabilities of having 1,2,3...26 piles and coming up with an answer that on average you will have 6.37 piles, or whatever.
For my purposes I suppose it would be nice to visualize the whole bell curve of probabilities of every number of piles, to know whether the answer to my main question is highly likely or just vaguely likely. I can intuit that as more and more tiles are drawn, the number of piles approaches 26 (or whatever size the 'alphabet' is), but how exactly?
This sort of problem is best solved by making use of the linearity of expectation.
Each letter contributes to the expected number of piles the probability that a pile exists for that letter, that is, the probability that you draw two or more instances of that letter. If you have $n$ tiles and $m$ letters, the probability of getting two or more instances of a given letter is
$$ 1-\left(1-\frac1m\right)^n-\frac nm\left(1-\frac1m\right)^{n-1}\;, $$
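the complement of the probability of getting either zero instances or exactly one instance. Then by the linearity of expectation, the expected number of piles is just $m$ times this probability. Using $m\left(1-\frac1m\right)^n=(m-1)\left(1-\frac1m\right)^{n-1}$, this simplifies to
$$ \begin{aligned} m-m\left(1-\frac1m\right)^n-n\left(1-\frac1m\right)^{n-1} &= m-(m-1)\left(1-\frac1m\right)^{n-1}-n\left(1-\frac1m\right)^{n-1}\\ &= m-(m+n-1)\left(1-\frac1m\right)^{n-1}\;. \end{aligned} $$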
For $n=300$ and $m=26$, this comes out as
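$$ 26-325\left(\frac{25}{26}\right)^{299}\approx25.9974\approx26\left(1-10^{-4}\right)\;, $$
so you'd expect to have all $26$ piles almost all the time: the expected number of letters without a pile is only about $0.0026$, and that number is also an upper bound on the probability that at least one pile is missing.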
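As a sanity check on the closed form, here is a minimal Python sketch that evaluates it and compares it with a small Monte Carlo simulation. The function names (`expected_piles`, `pile_count_distribution`), the trial count and the seed are my own choices for illustration, not anything canonical. The simulation also returns a histogram of the number of piles, which gives a rough picture of how concentrated the distribution is around its mean.

```python
import random
from collections import Counter


def expected_piles(n: int, m: int) -> float:
    """Closed form from above: m - (m + n - 1) * (1 - 1/m)**(n - 1)."""
    return m - (m + n - 1) * (1 - 1 / m) ** (n - 1)


def pile_count_distribution(n: int, m: int, trials: int = 5_000, seed: int = 0) -> Counter:
    """Simulate `trials` draws of n tiles; map number of piles -> how often it occurred."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs are repeatable
    histogram = Counter()
    for _ in range(trials):
        # Draw n letters uniformly from an alphabet of size m and count each letter.
        letter_counts = Counter(rng.randrange(m) for _ in range(n))
        # A pile is a letter drawn at least twice; lone tiles are discarded.
        piles = sum(1 for c in letter_counts.values() if c >= 2)
        histogram[piles] += 1
    return histogram


if __name__ == "__main__":
    # n = 30 is used here because for n = 300 nearly every trial gives all 26 piles.
    n, m, trials = 30, 26, 5_000
    hist = pile_count_distribution(n, m, trials)
    simulated_mean = sum(k * v for k, v in hist.items()) / trials
    print(f"closed form: {expected_piles(n, m):.4f}")
    print(f"simulated:   {simulated_mean:.4f}")
    for piles in sorted(hist):
        print(piles, hist[piles] / trials)  # empirical probability of each pile count
```

The simulated mean should land close to the closed form, with the gap shrinking as `trials` grows.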
You can compare this to the solution of the coupon collector's problem, which states that the expected number of draws needed to draw each letter at least once is $mH_m$, where $H_m$ is the $m$-th harmonic number. In your example, $26H_{26}\approx100$.
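The coupon collector figure is easy to check numerically. This is just a sketch; the function name `coupon_collector_expected_draws` is made up for illustration, and exact fractions are used only to avoid accumulating rounding error in the harmonic sum.

```python
from fractions import Fraction


def coupon_collector_expected_draws(m: int) -> float:
    """Expected draws to see each of m equally likely letters at least once: m * H_m."""
    harmonic = sum(Fraction(1, k) for k in range(1, m + 1))  # H_m as an exact fraction
    return float(m * harmonic)


if __name__ == "__main__":
    print(coupon_collector_expected_draws(26))  # roughly 100, as stated above
```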