$2$ people (C and D) decide to play a computer assisted game. The computer is programmed to quickly play as many fair hands (using the equivalent of a fair $52$ card deck), until someone wins the hand so the win will appear instantly when they press a button on the keyboard and then it waits for the next keypress for the next game. The computer is necessary because of how infrequent wins are, thus a real deck of cards would take way too long between wins.
The rules are player C can win if he gets $3$ "gapped" $3$ card straights dealt in order. An A (ace) will be considered the highest rank for both C and D so $A23~567~9TJ$ (T = ten rank) would NOT be a win for C but $234~QKA~789$ would be. There has to be at least a gap of $1$ rank between the $3$ card straights but can be wider as shown in the 2nd example. Note that the individual straights of length $3$ have to be dealt in order but the actual $3$ length $3$ straights don't need to be in relative order (as shown by $234QKA789$). That is still a win for C and need not be $234789QKA$ (although that would also be a win for C). The simplest example of a win for C would be something like $234678TJQ$. This is also the lowest ranked win for C. The highest rank win for C would be $45689TQKA$. Remember permutations are allowed such as $89TQKA456$ is also a win but $243678TJQ$ is not. If you think of each gapped $3$ card straight as a letter. Such as with $234678TJQ$, X=$234$, Y=$678$, Z=$TJQ$, then valid permutations are XYZ, XZY, YXZ, YZX, ZXY, and ZYX.
For player D to win, $2$ pattern classes are allowed. Either any $6$ card straight (such as $345678$) or this pattern: $223344$, $334455$, $556677$... $QQKKAA$.
The computer will keep dealing community (shared) cards until either there is a winner of the hand (not likely) or all $52$ cards are depleted (very likely). The cards are reshuffled after each win and after each nonwin (all $52$ cards are depleted). Permutations are NOT allowed. The straights and patterns MUST be dealt in order. For example, $345678$ is a win for D but $357468$ is not, even though they could be the same exact cards. Suits are irrelevant for the straights and other winning patterns. Any suits are allowed.
So it is a rainy day and they play many hundreds of winning hands to see who gets more wins. The question is who has the mathematical advantage and by how much? A hand that ties can be ignored as they will not count that hand and will continue play. Neither player has any knowledge of the expected probability and both just play on a "hunch" based on the observed patterns to win which they are told about.
The $300$ bounty ends Monday night (Aug 22nd) around 11:59PM (Eastern time) already including the $24$ hour grace period. Other simulations are encouraged but also math solutions or partial math solutions (like how to set it up properly).
Sample data of $3$ million random shuffles can be downloaded here for test purposes and/or analysis for mathematical solutions.:
https://drive.google.com/file/d/0BweDAVsuCEM1amhsNmFITnEwd2s/view
Format is $52$ bytes of data per line and $2$ more bytes for the end of line so $54$ bytes total per line, $3$ million lines. "$2$" = rank $2$, "$3$" = rank $3$... "T" = rank $10$.... All cards are represented by this set of ranks: "$23456789TJQKA$". There are $4$ of each rank in each shuffle (suits are irrelevant in this card game).
Where are the mathematical attempts at solving the probabilities of this card game? How about even some partial solutions like take the winning D patterns and compute the probability of a D win if D was the only player, subtracting out the overcounting cases where $2$ (or more) of those patterns can appear in the same shuffled deck. For example, if we get something like $234567223344$

Since the winning hands are so rare, I thought about generating them directly, separately for player C and D.
The chances of player D
For a fixed set of six adjacent positions, we already know the odds: the probability of seeing D's winning pattern 1 is $$ P_{D_1}=8\prod_{i=0}^5\frac{4}{52-i},$$ for pattern 2 it is $$ P_{D_2}=11\prod_{i=0}^2\frac{4\cdot 3}{(52-2i)(52-2i-1)}, $$ and the probability for any winning pattern for D at these positions is $$ P_D=P_{D_1}+P_{D_2}. $$
To create a representative winning hand, we first select between winning patterns 1 and 2, respecting the ratio $P_{D_1}:P_{D_2}=512:297$. Next we choose between the 8 or 11 possible ways to assign cards according to the selected pattern at the positions considered, they all have the same probability. Then, we randomly distribute the remaining cards.
Since we don't want to only create winning hands which show a winning pattern at fixed positions, we also want to randomly select these positions. Up to now, nothing depended on that selection, so we can defer determining these positions to this moment. We shuffle the remaing cards together with a special marker that after shuffling is replaced by the cards forming the selected winning pattern. This way we create each pair of a winning hand and six adjacent positions that show a winning pattern in that hand with the same probability. If we forget about the selected winning positions, we are overcounting the winning hands that show $n>1$ winning patterns. To compensate, we could weight these hands with $1/n$. Alternatively, we can reject a winning hand that has a winning pattern before the selected one. This avoids having to use rationals or floating point numbers. I found that I can keep more than 95% of the generated winning hands, so I did that. Now we can generate representative candidate wins for D, next we can test if it is a win for C after all and also reject in this case (after counting it as an interesting outcome).
So what can we do with a stream of representative winning D hands? One thing is to determine the average number of cards needed until it is a win for D. I get 28.963 (after $10^{11}$ experiments, where an experiment consists of finding a winning hand that may then be rejected).
It is also possible to count how often a particular number of cards was needed and compare that to the numbers expected from a uniform distribution:
But there is more! We can use the observed ratio of all D's wins to the wins after the first six cards as an experimentally justified multiplier for $P_D$ (which is the exact probability for D winning after six cards), replacing the rough estimated multiplier 47 that came from the assumption of a nearly uniform distribution. We get a multiplier of 44.66 and D's winning chances as 0.0158%.
The chances of player C
We can use essentially the same idea for player C. The probability of seeing a winning pattern at three triples of adjacent positions is already known to be $$ P_C=3!\cdot 10\prod_{i=0}^8 \frac{4}{52-i} $$ (that calculation only needs that each pattern demands nine different particular cards at each of nine chosen positions).
We select one of the ten patterns and shuffle the remainig cards together with three special marks. Rejecting hands to avoid overcounting is slightly more difficult, because it is not enough to compare the last position needed to see that C is winning, because we might still use different straight first and second triples. However, we can just use whatever triples our algorithm finds as representative for one hand, and check if this agrees with the positions we used to create the winning hand. This time, we have to reject more hands, but we still can keep more than 85%. The following results are again after $10^{11}$ experiments.
The average number of cards needed to see a winning pattern is 41.10, the distribution is very different:
It is a problem that the first value is so relatively close to zero, because it makes the computation of the multiplier less stable. (By comparing $P_D$ and $P_C$, we could have known that before.) After $10^5$ experiments (I didn't look at smaller values), the estimated multiplier for the player D case already had two good digits, while the estimate for the player C case was 50% off. That is no surprise, given that only four occurences of wins after 9 nine cards have been observed. After the $10^{11}$ experiments, 6581830 such wins were observed. The estimated multiplier is 13000 (with three plausible digits), giving a winning chance for C of 0.0153%.
More results
While we create 63.288% of the D wins as pattern 1 wins, only 61.565% percents of the surviving D win hands belong to pattern 1. The ratio of real ties to candidate ties (which can be determined from both simulations) is about 1:31.
Using the observed ratio of real ties and wins at the first possible position, we can again use $P_D$ resp. $P_C$ as "leverage", this time to get the expected frequency of real ties. The D winning data predict one real tie in 43974812 hands, the C winning data predict one in 43961799 hands. The result from the D wins is again more relyable, but seeing both estimates are so close is reassuring.
The program
The program to perform the experiments is written in C++11, compilable with
g++from both Debian stable and oldstable with the option-std=c++11. It is too long to post it here, but if anyone is interested in it (or the detailed results) I will find a way to transfer it. Performing $10^{11}$ experiments took about 32 hours.Getting better results faster
I found a better way to use my method of getting representative winning hands. If we multiply $P_D$ by 47, we get the sum of the probability of getting a deck that has a winning pattern in the first possible position, and the propability of getting one that has it in the second position, and so on. These events are not disjoint, we are overcounting the decks that satisfy the winning condition in more than one way: we count them twice if they have two winning patterns, three times if they have three etc.
We can estimate the factor by which we are overcounting by looking closely at the randomly created (candidate) winning decks. For all the created winning decks, we sum the number of ways in which it is winning, and divide that by the number of decks we have created. Dividing $47P_D$ by this factor gives D's single player winning chance. If we want to know his chance in the two player version, we have to correct by a factor that we can also estimate: the number of real wins divided by the number of candidate wins. (The candidate wins cancel.) This method is better because we don't depend on an estimate for the number of wins in the first possible position of which there are almost two orders of magnitude less than the number of experiments we have performed. We already get three significant digits after $10^5$ experiments. As single player winning probability we get 0.015787%, for the two player variant we get 0.015776%.
We can do the same for player C with an even better improvement in accuracy, but first we have to find the equivalent of the factor 47. It is the number of ways in wich we can find three disjoint triples of adjacent positions of the 52 cards in the deck. This number is 15180. (We could get this by stars and bars, but I already had a list of them for another experiment: If you take all these possibilities and count how often each position appears as last selected position, the result looks very much like the distribution of the number of cards needed for a C win shown above.) So $15180P_C$ is morally as good an estimate for C's chances as $47P_D$ is for D's, however it is less accurate, it is almost 17% off. Estimating the correction factor again gives three significant digits after only $10^5$ experiments. Eventually, we get 0.015332% as C's single player winning chance and 0.015271% for the two player variant.
(I missed a simpler way to estimate the overcounting factor. We just need to divide the number of experiments by the number of winning decks that we keep as representative. This avoids having to find the number of ways in which a deck is winning, which is slightly tricky for C. However, being able to do it will be useful in the next section.)
Strict bounds for the single player variants
(I'll add this part later. It's nice, but not enough to really prove that D's chances are better.)