if you roll an x-sided die, n-number of times, what are the chances of getting r-number of repeated values?
if I roll a twenty-sided die, eighty times, what are the chances that I'll roll a six twice and a sixteen twice? ... but NOT specifically six and sixteen; just any value/s repeated, two or more times?
from, say, eight rolls of a twenty-sided die, I get this: {10,12,8,10,2,5,3,5}
what were the chances of getting those two tens, and two fives, in that many rolls?
and the ACTUAL question I have is: what is the calculation to figure out how many 'repeated values' you should expect in n rolls of an x-sided die?
so far, if I do this:
1 - r * (product from i=0,(n-1) of (x-i)/x)
given the example of rolling two repeats (that is, two fives and two tens) from eight throws of a twenty-sided die, like this {10,12,8,10,2,5,3,5}, I would put this into Wolfram Alpha:
1 - 2 * ( product from i=0,7 of (20-i)/20 )
https://www.wolframalpha.com/input?i=1+-+2+*+%28+product+from+i%3D0%2C7+of+%2820-i%29%2F20+%29
and get 'basically 60%'
but how do I figure out the chances of any value repeating two or more times, specifically t-number of times?
Use combinations and permutations
One option is to follow the patterns in this blog post (not mine). An excerpt:
(Note that this example uses d10s rather than d20s, but the general principles are there.) It should be possible to derive an algorithm to determine these expressions; however, I have not done so. Rather, I prefer to...
Use a more general-purpose algorithm
I've created a more general-purpose algorithm as part of my Icepool Python library that can solve many dice pool problems; while a bespoke solution as above is probably more efficient, mine is good enough for pool sizes that you're likely to physically roll. You can read a crude explanation of the algorithm here. In short, as long as we can express the pool evaluation as a transition function over (outcome, count) pairs without too large a state space, it's possible to compute the solution reasonably efficiently.
You can try it in your browser using this JupyterLite notebook. Just change the d10s to d20s.
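To make the "transition function over (outcome, count) pairs" idea concrete, here is a minimal pure-Python sketch. It is not Icepool's implementation or API: the names `evaluate_pool` and `matching_sets` are made up for illustration, and the sketch assumes a single pool of identical dice with faces numbered 1 to x. It visits each face once, decides how many dice showed that face, and feeds that count to the transition function.

```python
from collections import Counter
from math import comb


def evaluate_pool(x, n, next_state, initial_state):
    """Count the ways n rolls of an x-sided die reach each final state.

    next_state(state, outcome, count) is called once per face, where
    count is how many of the n dice showed that face.
    (Hypothetical helper for illustration, not an Icepool function.)
    """
    # Maps (state, dice not yet assigned a face) -> number of ordered rolls.
    layer = Counter({(initial_state, n): 1})
    for outcome in range(1, x + 1):
        new_layer = Counter()
        for (state, remaining), ways in layer.items():
            # The last face has to take every die that is still unassigned.
            counts = [remaining] if outcome == x else range(remaining + 1)
            for count in counts:
                key = (next_state(state, outcome, count), remaining - count)
                # comb() picks which of the remaining dice show this face.
                new_layer[key] += ways * comb(remaining, count)
        layer = new_layer
    result = Counter()
    for (state, _), ways in layer.items():
        result[state] += ways
    return result


def matching_sets(state, outcome, count):
    """Keep a sorted tuple of the sizes of all sets of two or more."""
    return tuple(sorted(state + (count,))) if count >= 2 else state


dist = evaluate_pool(20, 8, matching_sets, ())
total = sum(dist.values())  # equals 20 ** 8
for sizes, ways in sorted(dist.items()):
    print(sizes, ways, f"{ways / total:.4%}")
```

The state here is the full tuple of matching-set sizes, so `dist[(2, 2)]` is the number of 8d20 rolls with exactly two pairs and nothing else repeated. The state space stays small because only the set sizes are remembered, not which faces produced them.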
Here's an example output for 8d20:
Denominator: 25600000000
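That denominator is 20^8, the number of ordered ways to roll eight d20s. As a hand check in the spirit of the combinations-and-permutations approach, the sketch below computes two probabilities over that denominator directly. The function names are hypothetical, and the formulas are ordinary counting arguments rather than anything taken from the linked blog post or from Icepool.

```python
from fractions import Fraction
from math import comb, factorial, perm


def p_all_distinct(x, n):
    """Chance that n rolls of an x-sided die show n different faces."""
    return Fraction(perm(x, n), x ** n)


def p_exact_pairs(x, n, pairs):
    """Chance that exactly `pairs` faces appear twice and every
    other rolled face appears once (no triples or larger sets)."""
    singles = n - 2 * pairs
    if singles < 0 or pairs > x:
        return Fraction(0)
    ways = (comb(x, pairs)                 # faces that form the pairs
            * comb(x - pairs, singles)     # faces that appear once
            * factorial(n) // 2 ** pairs)  # orderings of the rolls
    return Fraction(ways, x ** n)


print(float(p_all_distinct(20, 8)))      # all eight rolls differ
print(float(1 - p_all_distinct(20, 8)))  # at least one repeated value
print(float(p_exact_pairs(20, 8, 2)))    # shape of {10,12,8,10,2,5,3,5}
```

For 8d20 the all-distinct count is 5079110400 out of 25600000000 (about 19.8%), so some value repeats about 80.2% of the time, and the "exactly two pairs, everything else single" pattern accounts for 5860512000 outcomes (about 22.9%).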
Improving efficiency for more specific queries
If you have a more specific query than enumerating all possible sets of matching sets, you can reduce the state space and improve efficiency by only retaining enough information to compute the answer. For example, if you just want to know the number of pairs:
(This counts e.g. a quadruple as two pairs; if you want unique pairs, just replace // with >=.)
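As a rough stand-alone illustration of the reduced state space, the sketch below tracks only a running pair count instead of the full list of set sizes. It is plain Python rather than Icepool code, `pair_distribution` is a hypothetical name, and the `unique` flag is an assumption that mirrors the // versus >= choice in the note above.

```python
from collections import Counter
from fractions import Fraction
from math import comb


def pair_distribution(x, n, unique=False):
    """Count the ways n rolls of an x-sided die give each number of pairs.

    unique=False counts count // 2 pairs per face (a quadruple is two pairs);
    unique=True counts one per face that appears at least twice.
    """
    # Maps (pairs so far, dice not yet assigned a face) -> ordered rolls.
    layer = Counter({(0, n): 1})
    for face in range(1, x + 1):
        new_layer = Counter()
        for (pairs, remaining), ways in layer.items():
            # The last face has to take every die that is still unassigned.
            counts = [remaining] if face == x else range(remaining + 1)
            for count in counts:
                gained = int(count >= 2) if unique else count // 2
                key = (pairs + gained, remaining - count)
                new_layer[key] += ways * comb(remaining, count)
        layer = new_layer
    # Every die is assigned by now, so each pair count appears once.
    return {pairs: ways for (pairs, _), ways in layer.items()}


dist = pair_distribution(20, 8, unique=True)
total = sum(dist.values())  # equals 20 ** 8
expected = Fraction(sum(p * w for p, w in dist.items()), total)
print(float(expected))  # average number of distinct repeated values in 8d20
```

With `unique=True` the mean of this distribution is the expected number of distinct values that show up two or more times. By linearity of expectation it should agree with the closed form x * (1 - ((x-1)/x)^n - (n/x) * ((x-1)/x)^(n-1)), which comes to about 1.14 for eight rolls of a d20.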