Win/loss ratios and selection strategies


Imagine the following scenario:

You're at a TCG tournament that allows you to bring N decks with you. After each game, you may select another deck for your next game. You are allowed to keep score for each deck. Of course, some deck(s) will perform better in that day's meta than others. Your goal, after each match, is to try to select the best deck for the next one.

No, this is not an assignment, just something I've been thinking of lately.

For example, let's say I have 3 decks, A, B, and C.

My 1st selection strategy would be something like this:
Switch to a randomly chosen other deck each time you lose; otherwise, keep playing with the same deck.

Example: In my 1st game, I win with deck A, so I keep it for my 2nd game. I win again, and I keep it for my 3rd game. I lose my 3rd game, so I pick randomly (50-50 chance) between decks B and C.
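This first strategy can be sketched in a few lines of Python (the function name `next_deck` is my own, not from the post):

```python
import random

def next_deck(current, decks, won):
    """Strategy 1: keep the current deck after a win,
    switch to a uniformly random *other* deck after a loss."""
    if won:
        return current
    return random.choice([d for d in decks if d != current])
```

With decks A, B, and C, a loss while playing A returns B or C with equal probability, matching the 50-50 pick above.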

So then I decided to refine it a bit; here's my 2nd attempt: if you win, keep playing with that deck. If you lose, do a roulette-wheel selection over the remaining 2 decks, where the weights are the normalized win ratios of each deck.

So, let's imagine this situation:
Deck A: 5W/5L => 0.5 ratio
Deck B: 3W/5L => 0.375 ratio
Deck C: 2W/5L => 0.286 ratio

I'm currently playing with deck A and I lose. So, I get to pick between deck B and C. Their normalized weights are 0.567 for deck B and 0.433 for deck C. That means, on a selection, deck B has 56.7% chance to be selected, and deck C 43.3% chance.

For my 3rd attempt, I asked myself: if deck A is really superior in this meta, shouldn't I include it in the selection as well?

If so (using the example from attempt 2), the chances of each deck being selected are:

Deck A: 43.1%
Deck B: 32.3%
Deck C: 24.6%

That makes sense, right, using overall win ratios?

Well, not quite. It would work OK if we had the same amount of data for each deck. That made me go further.

Again, some win/loss ratios:

Deck A: 2W/1L = 0.67
Deck B: 20W/20L = 0.5
Deck C: 49W/51L = 0.49

Now, as you can see, the data is skewed. By our last method, here are the chances for selecting each deck:

Deck A: 40.4%
Deck B: 30.1%
Deck C: 29.5%

Now, I don't find that very good, because there is so little data on deck A. Yes, it might be a great deck, but decks B and C have shown solid performance over far more games than deck A has.

So what I'm really looking for is a function that takes into consideration the number of games played as well as the win ratio. Here are the guidelines:

If a deck has a high win ratio and a high play count relative to the others, mark it good
If a deck has a high win ratio but a low play count relative to the others, mark it possibly good
If a deck has a low win ratio and a high play count relative to the others, mark it crap
If a deck has a low win ratio but a low play count relative to the others, mark it unknown, but rank it below a high-win-ratio, low-play-count deck

So the function should use the relative play count to reinforce good win ratios and to discourage decks with low win ratios, without excluding them entirely. It should also give decks with a low relative play count a chance, but not over already-proven decks.

Is there a (simple) function for this kind of assessment?
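One simple candidate (my own suggestion, not from the post) is a smoothed win ratio: add a few pseudo-games at an assumed average win rate, so decks with little data are pulled toward the prior while well-tested decks keep their observed ratio. The prior values here are arbitrary tuning assumptions:

```python
def smoothed_ratio(wins, losses, prior_games=4, prior_mean=0.5):
    """Laplace-style smoothing: act as if every deck has already played
    prior_games extra games at a prior_mean win rate. Small samples get
    pulled toward prior_mean; large samples dominate the prior."""
    return (wins + prior_games * prior_mean) / (wins + losses + prior_games)

# The skewed example from the post:
# A: 2W/1L   -> raw 0.67, smoothed (2 + 2) / (3 + 4)    ≈ 0.571
# B: 20W/20L -> raw 0.50, smoothed (20 + 2) / (40 + 4)  = 0.500
# C: 49W/51L -> raw 0.49, smoothed (49 + 2) / (100 + 4) ≈ 0.490
```

Deck A still scores highest (it has only won so far), but its lead over B shrinks from 0.17 to about 0.07, reflecting how little evidence backs it.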

BEST ANSWER

You are describing the multi-armed bandit problem. You have $N$ slot machines (decks), each with some unknown expected payoff (win %). You want to maximize your payoff, which demands a careful balance between exploration (gathering data about each slot machine/deck) and exploitation (using the slot machine/deck that appears best so far). There are a variety of well-studied strategies for this problem, such as ε-greedy, upper confidence bound (UCB) methods, and Thompson sampling.
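As one concrete illustration, the classic UCB1 strategy scores each deck by its mean win rate plus an exploration bonus that shrinks as the deck accumulates games, which is exactly the "win ratio plus play count" behavior the question asks for (a sketch, not the only bandit strategy):

```python
import math

def ucb1_scores(stats, total_games):
    """UCB1: score = mean win rate + sqrt(2 * ln(total games) / deck games).
    Rarely played decks get a large bonus; heavily played decks are judged
    almost purely on their observed win rate."""
    scores = {}
    for deck, (wins, losses) in stats.items():
        n = wins + losses
        mean = wins / n
        bonus = math.sqrt(2 * math.log(total_games) / n)
        scores[deck] = mean + bonus
    return scores

# The skewed example from the question:
stats = {"A": (2, 1), "B": (20, 20), "C": (49, 51)}
total = sum(w + l for w, l in stats.values())  # 143 games overall
scores = ucb1_scores(stats, total)
best = max(scores, key=scores.get)
```

Here UCB1 still picks deck A next, but deterministically and for a principled reason: with only 3 games played, its uncertainty bonus is huge, so the strategy deliberately gathers more data on it. After a few more games, A's bonus shrinks and its true win rate takes over.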