I have a question concerning probability distributions. This is for an app I am developing in my spare time. A few years ago, during my school days, I would have been able to solve it on my own, but sadly not anymore. So I kindly ask for your help ;)
I have a (Zipf) distribution function that looks like this: $$ f(k) = \frac{1/k^{1.1}}{\sum_{n=1}^{30} 1/n^{1.1}}, \quad 0 < k \le 30 $$
This is a modified example from here: http://en.wikipedia.org/wiki/Zipf%27s_law#Theoretical_review, with parameter s = 1.1 and N = 30.
To my question: I draw, let's say, 15 random numbers between 1 and 30 (the value of a number is its rank), so I get a result like this: [1,6,3,27,2,6,7,1,22,15,9,19,4,8,12].
I want to calculate the probability that, let's say, 10 different values are in this array; in other words, I want the probability that I draw 10 different values when I draw 15 times with this distribution function.
I would appreciate any help very much. Feel free to use parameters other than 1.1/30/15/10; I am not interested in the exact result, but in the way to calculate it. Hints and links to useful information are always greatly appreciated.
I'd approach it this way. Firstly, there are $\binom{30}{10}$ ways to select 10 unique numbers, out of the $30^{10}$ total possible strings of length 10.
If we let $T = \left(t_1, t_2,\ldots,t_{10}\right)$ with $t_i \in \{1, 2,\ldots,30\}$ and $t_i \neq t_j$ for all $i\neq j$, then the probability of seeing observation set $T$ should be: $$\prod_{i=1}^{10}\frac{t_i^{-1.1}}{\sum_{n=1}^{30}n^{-1.1}}$$
Unfortunately, the probability of seeing a particular $T$ depends on the members of that set, so I'm not sure of a shortcut that avoids calculating the probability of each of the 30,045,015 possible selections.
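One possible way around the enumeration, offered as a sketch rather than as part of the reasoning above: the sum of $\prod_i p_{t_i}$ over every 10-element subset is the elementary symmetric polynomial $e_{10}(p_1,\ldots,p_{30})$, which a short dynamic program computes without listing the subsets. Multiplying by $10!$ counts the orders in which a given subset can be drawn, assuming the draws are independent. The function names `zipf_pmf` and `prob_all_distinct` are illustrative only.

```python
from math import factorial

def zipf_pmf(N=30, s=1.1):
    """Return [f(1), ..., f(N)] for the truncated Zipf distribution."""
    weights = [k ** -s for k in range(1, N + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def prob_all_distinct(p, n):
    """Probability that n independent draws from pmf p are all different."""
    # e[j] holds the elementary symmetric polynomial e_j of the
    # probabilities processed so far.
    e = [1.0] + [0.0] * n
    for pi in p:
        # Go downwards so each probability enters a subset at most once.
        for j in range(n, 0, -1):
            e[j] += pi * e[j - 1]
    # Each subset of n values can be drawn in n! different orders.
    return factorial(n) * e[n]

print(prob_all_distinct(zipf_pmf(), 10))
```

For a uniform distribution this reduces to the familiar birthday-problem product, which is a convenient sanity check.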
Edit: I just saw that I misread the question as all 10 different, instead of 10 different out of 15. The logic should then be $\binom{30}{10} \cdot \binom{10}{5}$ possible $T$ strings where 10 of the $t_i$'s are unique and 5 are repeats (this count assumes the 5 repeats fall on 5 different values; a value drawn three or more times is not covered by it). Once the $t_i$'s are identified, the probability of a specific $T$ string should be: $$\prod_{i=1}^{15}\frac{t_i^{-1.1}}{\sum_{n=1}^{30}n^{-1.1}}$$ This does mean there are now 7,571,343,780 unique possibilities.
Edit2: After thinking about it a bit more, I think the probability products should not be multiplied by the initial factorial factor. The order in which the "letters" of the string are selected is irrelevant assuming the draws are independent, so I've deleted them.
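For what it's worth, the "10 different values in 15 draws" probability can also be computed exactly without enumerating strings. The sketch below is one possible approach, not the method described above. It assumes independent draws and uses a generating-function identity: the probability of seeing exactly $m$ distinct values in $n$ draws is $n!$ times the coefficient of $x^n y^m$ in $\prod_{i=1}^{N}\left(1 + y\,(e^{p_i x} - 1)\right)$. The function names are my own.

```python
from math import factorial

def zipf_pmf(N=30, s=1.1):
    """Return [f(1), ..., f(N)] for the truncated Zipf distribution."""
    weights = [k ** -s for k in range(1, N + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def distinct_count_distribution(p, n):
    """Return d where d[m] = P(exactly m distinct values in n draws)."""
    max_m = min(len(p), n)
    # c[j][d] = coefficient of y^j x^d in the product built so far.
    c = [[0.0] * (n + 1) for _ in range(max_m + 1)]
    c[0][0] = 1.0
    for pi in p:
        # term[r] = pi^r / r!, the x^r coefficient of e^{pi x} - 1 (r >= 1).
        term = [0.0] + [pi ** r / factorial(r) for r in range(1, n + 1)]
        # Update j downwards so c[j - 1] still holds the previous product.
        for j in range(max_m, 0, -1):
            for d in range(n, 0, -1):
                c[j][d] += sum(c[j - 1][d - r] * term[r]
                               for r in range(1, d + 1))
    return [factorial(n) * c[m][n] for m in range(max_m + 1)]

dist = distinct_count_distribution(zipf_pmf(30, 1.1), 15)
print(dist[10])  # probability of exactly 10 distinct values in 15 draws
```

The returned list covers every possible number of distinct values, so its entries should sum to 1, which gives a quick check that the parameters were passed correctly.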