Pick 5 numbers: calculate the probability that the respective 4-number combinations appear for the $n$-th time (given the probability of their $n$-th appearance)?

We pick 5 distinct numbers from the range [1..36] (like playing a lottery). Each pick of 5 numbers ("playing one ticket") thereby also selects 5 distinct combinations of 4 numbers out of these 5 (let us call such a combination a "tuple4").
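To make the "tuple4" notion concrete, here is a minimal sketch that lists the 5 four-number subsets of one ticket. The function name `tuple4s` and the example ticket are illustrative assumptions, not part of the original setup.

```python
from itertools import combinations

def tuple4s(ticket):
    """Return the 5 four-number subsets of a 5-number ticket as sorted tuples."""
    # Sorting first gives each subset a canonical form, so the same four
    # numbers always produce the same tuple regardless of draw order.
    return list(combinations(sorted(ticket), 4))

# Hypothetical ticket, chosen only for illustration.
example = tuple4s([24, 3, 36, 11, 17])
for t in example:
    print(t)
```

Each ticket yields exactly 5 such tuples because leaving out any one of the 5 numbers gives a different 4-number subset.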

We repeat picking 5 distinct numbers ("playing one ticket") many times. After that, we count how often each tuple4 appeared.

It turns out that in $N$ plays there were:

  1. 21322 (66%) distinct tuple4 that appeared exactly once
  2. 8335 (26%) distinct tuple4 that appeared exactly 2 times (none of the tuples from the item above are included here; the same applies below)
  3. 2171 (7%) distinct tuple4 that appeared exactly 3 times
  4. 481 (1%) distinct tuple4 that appeared exactly 4 times

The percentages are relative to the number of distinct tuple4 that appeared (the four counts above sum to 32309), not to the total number of tuple4 appearances ($5N$).
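A tally of this kind could be produced along the following lines. This is only a sketch: the function name `tally_plays`, the fixed seed, and the use of Python's `random` module are assumptions, and the shares are printed relative to the number of distinct tuple4 seen (divide by `5 * n_plays` instead to get the share of all appearances).

```python
import random
from collections import Counter
from itertools import combinations

def tally_plays(n_plays, seed=0):
    """Simulate n_plays tickets and count how often each tuple4 appears."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    seen = Counter()
    for _ in range(n_plays):
        ticket = sorted(rng.sample(range(1, 37), 5))  # 5 distinct numbers in 1..36
        seen.update(combinations(ticket, 4))          # the ticket's 5 tuple4
    return seen

seen = tally_plays(9400)
by_multiplicity = Counter(seen.values())  # k -> number of tuple4 seen exactly k times
distinct = len(seen)

for k in sorted(by_multiplicity):
    share = by_multiplicity[k] / distinct
    print(f"exactly {k} time(s): {by_multiplicity[k]} ({share:.1%})")
```

The exact counts depend on the random seed, so they will differ somewhat from the figures quoted above.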

Now we "play" again (pick 5 distinct numbers). What is the probability that, among the 5 distinct tuple4 that such a play generates:

  1. At least 3 would appear for the first time (considering all of the last $N$ plays); for the remaining 2 we don't care (they may also appear for the first time or for any $k$-th time)
  2. Exactly 2 would appear for the first time and each of the remaining 3 would appear for the second or third time (but not for the first or for the 4th or later time)

How can I construct such formulas?
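Whatever formula is proposed, it can be compared against a direct Monte Carlo estimate. The sketch below builds one simulated history of `n_plays` tickets and then draws many fresh tickets against it. The names `draw` and `estimate`, the seed, and the trial count are assumptions, and event 2 is read as "exactly 2 tuple4 are new and each of the other 3 was previously seen once or twice", which is one interpretation of the wording above.

```python
import random
from collections import Counter
from itertools import combinations

def draw(rng):
    """One ticket: 5 distinct numbers from 1..36, sorted."""
    return tuple(sorted(rng.sample(range(1, 37), 5)))

def estimate(n_plays=9400, n_trials=20000, seed=1):
    """Estimate the probabilities of events 1 and 2 for one simulated history."""
    rng = random.Random(seed)
    history = Counter()
    for _ in range(n_plays):
        history.update(combinations(draw(rng), 4))

    hits1 = hits2 = 0
    for _ in range(n_trials):
        # Prior appearance counts of the 5 tuple4 in the new ticket;
        # a Counter returns 0 for tuples that were never seen.
        prior = [history[t] for t in combinations(draw(rng), 4)]
        new = sum(c == 0 for c in prior)
        second_or_third = sum(c in (1, 2) for c in prior)
        if new >= 3:                            # event 1
            hits1 += 1
        if new == 2 and second_or_third == 3:   # event 2 (assumed reading)
            hits2 += 1
    return hits1 / n_trials, hits2 / n_trials

p1, p2 = estimate()
print(f"event 1: {p1:.3f}, event 2: {p2:.3f}")
```

The trial tickets are not added to the history, so every trial is judged against the same $N$ plays. The estimate is conditional on one particular simulated history, not on the actual data behind the counts quoted in the question.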

P.S. Among all 376992 possible combinations of 5 numbers from [1..36] there are 58905 possible distinct tuple4. But I think this information and the number of plays ($N$) are not needed. In my case $N = 9400$.
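The two counts can be checked with `math.comb`. The sketch also computes, for illustration only, the chance that one specific tuple4 has not yet appeared after $N$ plays, assuming every ticket is drawn uniformly and independently. Under that assumption the value depends on both 58905 and $N$, which suggests these quantities may matter after all.

```python
from math import comb

tickets = comb(36, 5)        # number of possible 5-number tickets: 376992
tuple4_total = comb(36, 4)   # number of possible tuple4: 58905

# A fixed tuple4 is contained in 36 - 4 = 32 tickets (one per choice of the
# fifth number), so a uniformly random ticket contains it with probability p.
p = (36 - 4) / tickets       # equals 5 / 58905

N = 9400
p_unseen = (1 - p) ** N                                      # tuple4 never seen in N plays
expected_once = tuple4_total * N * p * (1 - p) ** (N - 1)    # expected number seen exactly once

print(tickets, tuple4_total)
print(f"P(specific tuple4 unseen after N plays) = {p_unseen:.4f}")
print(f"expected number of tuple4 seen exactly once = {expected_once:.0f}")
```

These are marginal, per-tuple figures; the 5 tuple4 of a single ticket overlap in 3 numbers, so their appearance histories need not be independent.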

My thinking was as follows:

At least 3 would appear for the first time (considering all of the last $N$ plays), and for the remaining 2 we don't care (first time or any $k$-th time): $0.66 \cdot 0.66 \cdot (1-0.66) \cdot ((1-0.66)+0.66) \cdot ((1-0.66)+0.66) \approx 15\%$

At least 3 would appear for the first time (considering all of the last $N$ plays), and the remaining 2 only for the second time or more (but not for the first time): $0.66 \cdot 0.66 \cdot (1-0.66) \cdot (1-0.66) \cdot (1-0.66) \approx 1.7\%$
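The two products can be evaluated directly, as in the sketch below. For comparison it also evaluates a plain binomial "at least 3 of 5" expression. That expression is not taken from the question: it assumes the 5 tuple4 behave independently and that 0.66 really is the per-tuple probability of a first appearance, and neither assumption is established here.

```python
from math import comb

p_first = 0.66  # value used in the attempts above

# The two products exactly as written in the attempts.
attempt_1 = p_first * p_first * (1 - p_first) * 1 * 1
attempt_2 = p_first ** 2 * (1 - p_first) ** 3

# Binomial "at least 3 successes out of 5", under the independence
# assumption stated in the lead-in (an assumption, not a confirmed model).
at_least_3 = sum(
    comb(5, k) * p_first ** k * (1 - p_first) ** (5 - k)
    for k in range(3, 6)
)

print(f"attempt 1: {attempt_1:.1%}")
print(f"attempt 2: {attempt_2:.1%}")
print(f"binomial at-least-3-of-5: {at_least_3:.1%}")
```

Both products contain the factor 0.66 only twice and carry no factor for the number of ways to choose which tuple4 are the new ones. The binomial form includes both, which may account for part of the gap from the simulation.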

But these results don't seem to reflect what really happens (I ran a careful random-number computer simulation), so I think I made a mistake somewhere.