If you like you can recast everything below as a problem about drawing $k$ balls of $M$ different colors from an urn without replacement.
There is a famous trading card game abbreviated YGO. In this game you select your own deck of cards before you play. Naturally, this leads to the overarching question: which cards do I play, and in what quantities? Usually (but not always), the different types of cards can be treated as a partition of the entire deck for the purpose of formulating these questions.
We will be thinking of our opening hand. If we just want to know something like "what is the probability of opening X copies of card type A?", we can use the probability mass function of the hypergeometric distribution, and we can generalize this to more than 2 types of cards using the multivariate hypergeometric distribution.
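For instance, the probability of opening at least one copy of a card you run 3 copies of follows directly from the hypergeometric PMF (the 40-card deck and 5-card hand here are just illustrative numbers):

```python
from math import comb

def p_at_least_one(copies, deck_size, hand_size):
    """P(n_A >= 1) = 1 - P(n_A = 0) under the hypergeometric distribution."""
    return 1 - comb(deck_size - copies, hand_size) / comb(deck_size, hand_size)

# Chance of seeing a 3-of in a 5-card hand from a 40-card deck: about 33.8%.
p = p_at_least_one(3, 40, 5)
```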
But the questions that come up are often significantly more complicated than that. Instead of probabilities of opening hands, I want to generalize to expected "payoffs", which I'll call the expected card advantage of our opening hand.
One "simple" definition of expected card advantage could be
$$\mathbb E\left[\max_{1\leq i \leq j \leq M} \lambda_{i,j}\, 1_{\{n_i \geq 1\}} 1_{\{n_j \geq 1\}}\right].$$
Here we have partitioned our deck into $M$ classes (types of cards). We draw $k$ cards in our opening hand and $n_i$ is the number of cards in our opening hand that are from class $i$.
In game jargon we would say that the probability $\mathbb P(n_i \geq 1, n_j \geq 1)$ is the probability of opening a two-card combo with cards from $i$ and $j$ (although we also include the possibility of one-card combos, since we can have $i=j$). Then $\lambda_{i,j}$ would be the card advantage I would get if I played that combo. Usually, I can only play one combo, which is why I always want to play the one generating the maximal card advantage. Hence, the $\max$.
I'm very ignorant about the literature on urn models. I wonder how or whether we can efficiently compute the expected card advantage. My question is:
Has this or a more general question been studied before and where can I read about that?
We can probably recast this problem into a purely combinatorial problem, although I'm not sure this will make it easier to find the right book.
I want to give a semi-realistic example of what $\lambda$ could look like. For simplicity I only consider $i\neq j$.
I have $M = 5$ classes with sizes
G: 3, S: 3, F: 3, JJ: 3, T: 28.
Then $\lambda$ is
$$\begin{pmatrix} & G & S & F & JJ & T \\ G & 4 & 6 & 4.7 & 4 & 5 \\ S & 6 & 2 & 6 & 2 & 3 \\ F & 4.7 & 6 & 2.7 & 2.7 & 3.7 \\ JJ & 4 & 2 & 2.7 & 2 & 3 \\ T & 5 & 3 & 3.7 & 3 & 2 \\ \end{pmatrix}$$
As you can see we cannot expect any kind of sparsity in the matrix.
Note: this is a very, very rough payoff matrix for the so-called Salamangreat deck (classes: Gazelle, Spinny, Foxy, Jack Jaguar or Falco, and Tech choices). I ignored cards like Circle, Cynet Mining, and Lady Debug to keep it "simple". I count card advantage gained, including Salads which are live in the GY for the next turn. Techs are considered to always give +1. The trap you search using Gazelle (if you can) counts as a +2. The .7 comes from the approximate probability of using Foxy's first effect to excavate.
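For a setup this small, the expected card advantage can be brute-forced by iterating over multivariate-hypergeometric count vectors rather than individual hands. A sketch (the hand size of 5 is my assumption, and I treat a diagonal entry $\lambda_{i,i}$ as a one-card combo requiring only $n_i \geq 1$):

```python
from itertools import product
from math import comb

sizes = [3, 3, 3, 3, 28]  # G, S, F, JJ, T
lam = [
    [4,   6,   4.7, 4,   5  ],
    [6,   2,   6,   2,   3  ],
    [4.7, 6,   2.7, 2.7, 3.7],
    [4,   2,   2.7, 2,   3  ],
    [5,   3,   3.7, 3,   2  ],
]
k = 5  # assumed hand size

total_weight = 0
total_payoff = 0.0
# Iterate over all class-count vectors (n_1, ..., n_5) summing to k;
# each vector's weight is a product of binomial coefficients.
for counts in product(*(range(min(s, k) + 1) for s in sizes)):
    if sum(counts) != k:
        continue
    weight = 1
    for s, n in zip(sizes, counts):
        weight *= comb(s, n)
    payoff = max(
        lam[i][j]
        for i in range(5) for j in range(i, 5)
        if counts[i] >= 1 and counts[j] >= 1
    )
    total_weight += weight
    total_payoff += weight * payoff

expected = total_payoff / total_weight
```

The total weight should come out to $\binom{40}{5}$, which is a useful sanity check on the enumeration.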
With free choice of hand size, the problem is NP-hard
We will reduce the decision version of vertex cover to this. Given a graph $G = (V, E)$ and a target cover size $s$, make a deck with one card per vertex, set the hand size to $|V| - s$, and let $\lambda_{i,j} = 1$ if $\{i,j\} \in E$ and $0$ otherwise; a hand then corresponds to the set of vertices *not* in a prospective cover. The value of a hand is $1$ iff there is some edge in the graph that is not covered, i.e. both endpoints of the edge were selected to not be in the prospective cover. If there is no vertex cover of size $s$, then every hand is worth $1$ and the expected value of the max combo is $1$; if there is such a vertex cover, then the value for the corresponding hand is $0$ (i.e. no edges were left out) and the expected value of the max combo is less than $1$.
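To illustrate the reduction on a toy instance (a hypothetical sketch, not a formal proof): take the triangle graph, let $\lambda$ be $1$ exactly on edges, and let a hand select the vertices left out of a prospective cover. The minimum hand value over all hands is below $1$ exactly when the expected max combo value is below $1$:

```python
from itertools import combinations

edges = {(0, 1), (0, 2), (1, 2)}  # triangle graph
n = 3

def min_uncovered_value(cover_size):
    """Max combo value, minimized over all hands of size n - cover_size.
    Value 1 means some edge has both endpoints outside the cover."""
    hand_size = n - cover_size
    return min(
        max((1 for e in edges if set(e) <= set(hand)), default=0)
        for hand in combinations(range(n), hand_size)
    )

# A triangle has a vertex cover of size 2 but none of size 1:
# min_uncovered_value(2) == 0, while min_uncovered_value(1) == 1.
```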
"Single-pass" evaluation
We will want to find some exploitable structure in "typical" combo functions. I recently published a paper that may help, depending on the structure of the payoff matrix (or tensor) $\lambda$.
Consider the following process of evaluating a hand of cards: put the hand into some canonical order (say, sorted by class), then look at the cards one at a time.
You can remember things between each step, and update your memory at each step, such that at the end you can determine the max payoff of the hand. In terms of Python code, this corresponds to a transition function `next_state` that takes the current memory and the next card (or class count) and returns the updated memory. Given such a transition function, the output of the algorithm is the distribution of max payoffs of a random hand. From here, the expected value can be computed using the weighted mean.
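As an illustrative plain-Python sketch (not the library's actual API; `LAM` is a toy payoff matrix): the generic version of the state just remembers every class seen so far, together with the best payoff found:

```python
LAM = {("A", "A"): 1, ("A", "B"): 3, ("B", "B"): 2}  # toy payoff matrix

def next_state(state, card):
    """State: (frozenset of classes seen so far, best payoff so far)."""
    seen, best = state
    for other in seen | {card}:
        pair = tuple(sorted((other, card)))
        best = max(best, LAM.get(pair, best))
    return seen | frozenset([card]), best

def max_payoff(hand):
    state = (frozenset(), 0)
    for card in sorted(hand):
        state = next_state(state, card)
    return state[1]

# max_payoff(["A", "B", "B"]) finds the A+B combo worth 3.
```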
In the worst case, you would have to remember every previous card you drew, in which case the algorithm has no advantage over iterating over all possible hands with the multivariate hypergeometric formula. However, if you can determine the payoff of a hand while retaining less information than this, the algorithm can find the distribution of payoffs more efficiently.
Examples of exploitable structure
Longest straight
For example, a transition function can look for the longest straight in your hand.
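A plain-Python sketch of such a transition function (assuming possible card values are visited in ascending order, one step per value, with `count` the number of copies in the hand):

```python
def next_state(state, outcome, count):
    """State: (length of the current run, best run seen so far)."""
    run, best = state
    run = run + 1 if count >= 1 else 0
    return run, max(best, run)

def longest_straight(hand, values=range(1, 14)):
    state = (0, 0)
    for v in values:
        state = next_state(state, v, hand.count(v))
    return state[1]

# longest_straight([2, 3, 4, 7, 8]) finds the 2-3-4 run of length 3.
```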
In this case, you don't need to remember every previous card, only the length of the current run and the best run you've seen so far.
Sums
Another example, which may be applicable to "tech" cards, is to simply sum the value of each card.
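A sketch, where `TECH_VALUE` is a hypothetical per-class value table and the state is just the running sum:

```python
TECH_VALUE = {"T1": 1, "T2": 1, "T3": 2}  # hypothetical per-class values

def next_state(state, outcome, count):
    """State: running sum of tech values drawn so far."""
    return state + TECH_VALUE.get(outcome, 0) * count

def hand_sum(class_counts):
    state = 0
    for outcome, count in class_counts.items():
        state = next_state(state, outcome, count)
    return state
```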
Ideally the number of possible unique sums is small, e.g. if the payoffs of each tech card is a small integer.
Band matrix
If the payoff matrix $\lambda$ can be permuted into a band matrix with bandwidth $b$ (or analogously for tensors), then you can check for the best combo while only remembering whether the most recent $b$ classes have been drawn. For $p$ unique payoff values, the number of unique states would then be at most $2^b p$.
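A sketch of the band-matrix state (names are illustrative: `LAM` holds the nonzero payoffs, `B` the bandwidth, and classes are visited in the band ordering):

```python
B = 1  # bandwidth: combos only between classes adjacent in the band ordering
LAM = {(0, 0): 1, (0, 1): 3, (1, 2): 2, (2, 2): 1}  # hypothetical band payoffs

def next_state(state, cls, count):
    """State: (drawn-flags of the last B classes, best payoff so far)."""
    recent, best = state
    if count >= 1:
        best = max(best, LAM.get((cls, cls), best))
        for other, drawn in recent:
            if drawn and (other, cls) in LAM:
                best = max(best, LAM[(other, cls)])
    recent = (recent + ((cls, count >= 1),))[-B:]
    return recent, best

def best_combo(class_counts, num_classes=3):
    state = ((), 0)
    for cls in range(num_classes):
        state = next_state(state, cls, class_counts.get(cls, 0))
    return state[1]
```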
Count uniques by combo
If the number of combos is small, you could try counting the number of unique cards drawn in each combo.
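A sketch, with hypothetical combos given as sets of classes; the state is one counter per combo:

```python
COMBOS = ({"G", "S"}, {"G", "F"}, {"S", "F", "JJ"})  # hypothetical combos

def next_state(state, cls, count):
    """State: number of unique combo pieces drawn so far, per combo."""
    return tuple(
        c + (1 if count >= 1 and cls in combo else 0)
        for c, combo in zip(state, COMBOS)
    )

def combo_counts(class_counts):
    state = (0,) * len(COMBOS)
    for cls, count in class_counts.items():
        state = next_state(state, cls, count)
    return state
```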
Then at the end you would determine which (possibly partial) combo gives the best payoff.
Linearity of expectation
If all you are interested in is the expected value, and you can factor out a purely additive component, then you can take advantage of linearity of expectation. For example, suppose you can split the payoff into the sum of a purely additive tech component (the sum of the individual values of the tech cards drawn) and a combo component that does not depend on the tech values. Then you can sum the expected values of the two components to get the overall expected value. The first one is just the expected tech value per card times the hand size, and the second one can be computed more efficiently since you no longer need the joint distribution with the tech value.
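The additive part is worth spelling out: each drawn card is marginally a uniformly random card from the deck, so the expected sum over the hand is the hand size times the average card value. A sketch with a hypothetical toy deck, checked against brute force:

```python
from itertools import combinations
from math import comb

deck = [0, 0, 0, 1, 1, 2]  # hypothetical per-card additive values
k = 3                      # hand size

# Linearity of expectation: E[sum over hand] = k * (average card value),
# with no need for the joint distribution of the hand.
expected_additive = k * sum(deck) / len(deck)

# Brute-force check over all C(6, 3) equally likely hands.
brute = sum(sum(hand) for hand in combinations(deck, k)) / comb(len(deck), k)
```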
Code example
This uses my Icepool Python library. The idea is to split the payoff into three components: a base score, an inclusive combo score, and exclusive combos (the names used in the output below). The three components are not independent, but this doesn't matter for the expected value. They could be computed jointly for some extra cost.
There are some differences in the payoff matrix compared to the example in the question; I tried to follow the text rather than the literal numbers provided.
Result:

```
Base score: Die with denominator 3838380
Mean: 10.65
Inclusive combo score: Die with denominator 3838380
Mean: 0.2760323886639676
Exclusive combos: Die with denominator 3838380
Mean: 0.9430780693938589
Overall mean: 11.869110458057827
```
The algorithm
The algorithm itself is based on decomposing the multivariate hypergeometric formula as a product of binomials. Combined with the transition function formulation above, this allows us to memoize intermediate results at each step; in other words, the answer to "what would the state distribution be if I drew only $0 \ldots k$ cards, all of which are class $1 \ldots M$ or lower?" For the details, you can read my paper.
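A compact sketch of this recursion in plain Python (not the library's implementation; `next_state(state, class_index, count)` is a transition function as above, and states must be hashable):

```python
from functools import lru_cache
from math import comb

def state_distribution(class_sizes, hand_size, next_state, initial):
    """Weights of final states over all hands of `hand_size` cards,
    decomposing the multivariate hypergeometric as a product of binomials."""
    @lru_cache(maxsize=None)
    def dist(m, k):
        # Distribution of states after drawing k cards, all from classes 1..m.
        if m == 0:
            return {initial: 1} if k == 0 else {}
        out = {}
        size = class_sizes[m - 1]
        for c in range(min(k, size) + 1):
            w = comb(size, c)  # ways to pick c cards from class m
            for state, weight in dist(m - 1, k - c).items():
                ns = next_state(state, m - 1, c)
                out[ns] = out.get(ns, 0) + weight * w
        return out
    return dist(len(class_sizes), hand_size)

# Example: distribution of the number of distinct classes represented in a
# 3-card hand drawn from classes of sizes (3, 3, 4); weights sum to C(10, 3).
class_count_dist = state_distribution(
    (3, 3, 4), 3, lambda s, cls, c: s + (c >= 1), 0
)
```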