How to generate the pairwise win probability matrix from the win probability of each competitor?

For example, I have the win probability vector p = [0.2, 0.5, 0.8], which means that player 1 wins with probability 0.2 against a random opponent, player 2 wins with probability 0.5 against a random opponent, and so on.

I want to generate a pairwise matrix containing each 1-vs-1 win probability. I came up with this formula:

$$P(\textrm{A wins against B}) = \frac{P(\textrm{A wins}) \cdot (1 - P(\textrm{B wins}))}{P(\textrm{A wins}) \cdot (1 - P(\textrm{B wins})) + (1 - P(\textrm{A wins})) \cdot P(\textrm{B wins})}$$

So, to generate the matrix, we can use this formula:

$$M_{ij} = \frac{p_i \cdot (1 - p_j)}{p_i \cdot (1 - p_j) + (1 - p_i) \cdot p_j}$$

where p is the win probability vector and M is the matrix I want to generate.

My question is: what is the right formula?

I ask because when I test the formula empirically, the results are close to the expected values but not exact.

Here is the Python code for the empirical check:

import random

import numpy as np

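def getRandomFloat(low=0.0, high=1.0, decimalMax=None):
    '''Draws a uniform float in [low, high]; rounds only when decimalMax is given'''
    # Rounding every draw to 2 decimals would slightly bias the match outcomes
    x = random.uniform(low, high)
    return x if decimalMax is None else round(x, decimalMax)
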
def truncateFloat(f, n=2):
    '''Truncates/pads a float f to n decimal places without rounding'''
    s = '{}'.format(f)
    if 'e' in s or 'E' in s:
        return float('{0:.{1}f}'.format(f, n))
    i, p, d = s.partition('.')
    return float('.'.join([i, (d+'0'*n)[:n]]))

def generate_pairwise_win_prob(win_prob, float_precision=None):
    # We create the pairwise win probability (`p_win_prob`):
    p_win_prob = np.zeros((len(win_prob), len(win_prob)))
    w = win_prob
    for i in range(len(win_prob)):
        for j in range(i, len(win_prob)):
            # p_win_prob[i, j] = 1 / (1 + np.exp(w[j] - w[i])) # The Bradley-Terry-Luce model doesn't work
            p_win_prob[i, j] = (w[i] * (1 - w[j])) / (w[i] * (1 - w[j]) + (1 - w[i]) * w[j])
            if float_precision is not None:
                p_win_prob[i, j] = truncateFloat(p_win_prob[i, j], float_precision)
            p_win_prob[j, i] = 1 - p_win_prob[i,j]
    return p_win_prob

def pwp_empirical_proof(win_prob, draw_prob_interval=None):
    p_win_prob = generate_pairwise_win_prob(win_prob)
    victories = [0] * len(win_prob)
    defeats = [0] * len(win_prob)
    for i in range(100000):
        a, b = random.sample(range(len(win_prob)), 2)
        result = match(a, b, p_win_prob, draw_prob_interval=draw_prob_interval)
        if result != 0:
            if result == 1:
                victories[a] += 1
                defeats[b] += 1
            else:
                victories[b] += 1
                defeats[a] += 1
    predicted_win_prob = []
    for i in range(len(win_prob)):
        current = victories[i] / (victories[i] + defeats[i])
        current = truncateFloat(current, 2)
        predicted_win_prob.append(current)
    print("win_prob: " + str(win_prob))
    print("predicted_win_prob: " + str(predicted_win_prob))
    print()

# We define the function that will give the result of a match:
def match(i, j, p_win_prob, draw_prob_interval=None): # draw a comparison from the model
    assert i != j
    rdf = getRandomFloat()
    if draw_prob_interval is not None and abs(p_win_prob[i, j] - rdf) <= draw_prob_interval:
        return 0 # draw
    elif rdf < p_win_prob[i,j]:
        return 1 # i beats j
    else:
        return -1 # j beats i

pwp_empirical_proof([0.2, 0.5, 0.8])

And I get:

win_prob: [0.2, 0.5, 0.8]
predicted_win_prob: [0.12, 0.49, 0.87]

Best answer:

This is more of an extended comment than an answer, partly because, as I indicated in my comments, I don't think it is possible to answer the question as given.

As I said, we need more information: a player's winning percentage against competitors at large doesn't determine the probability of beating a particular opponent, even if we know the opponent's winning percentage.

You can make other assumptions to try to compute the probability of A beating B, but you must make sure that the results are consistent. The formula you suggest, $$P_{AB}=\frac{P_A(1-P_B)}{P_A(1-P_B)+P_B(1-P_A)}$$ where $P_{AB}$ is the probability that A beats B, has the property that $P_{AB}+P_{BA}=1$ when $P_{BA}$ is computed from the analogous formula, which is mandatory if there are no ties.
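
One way to see why the two probabilities sum to 1 is to rewrite the formula in terms of odds. The symbol $o_X = P_X/(1-P_X)$ is introduced here only for illustration and assumes $0 < P_X < 1$. Dividing the numerator and denominator by $(1-P_A)(1-P_B)$ gives

$$P_{AB}=\frac{o_A}{o_A+o_B},\qquad P_{BA}=\frac{o_B}{o_A+o_B},\qquad P_{AB}+P_{BA}=1.$$

Written this way, the formula has the same shape as a Bradley-Terry model with the odds playing the role of the strengths. As a small worked case, $P_A=0.2$ and $P_B=0.5$ give $o_A=0.25$, $o_B=1$ and $P_{AB}=0.25/1.25=0.2$.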

However, if the players are $A_1,A_2,\dots,A_n$, we also need, for example,

$$P_{A_1}=\frac{1}{n-1}\left(P_{A_1A_2}+P_{A_1A_3}+\cdots+P_{A_1A_n}\right)\tag1$$ if $A_1$ is equally likely to play any of the other players. Does your formula guarantee this? I doubt it.

We might try to rectify this with a formula like $$P_{A_kA_j}=(n-1)\,P_{A_k}\frac{P_{A_j}}{\sum_{m\neq k}P_{A_m}}$$

With this definition, equation $(1)$ would be satisfied, but we would no longer have $P_{AB}+P_{BA}=1$.

In short, I can't think of a formula, at least not off the top of my head, that would guarantee consistency of the results. Also, it seems to me that finding such a formula, if one exists, would require making entirely unrealistic assumptions about the problem.
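
To make the doubt about the question's formula concrete, here is a minimal sketch that checks it numerically. It assumes every opponent is equally likely, so a player's overall win probability is read as the average of the pairwise probabilities against the other players. The helper names `pairwise` and `implied_win_probs` are hypothetical and not part of the original code.

```python
def pairwise(p_i, p_j):
    """Question's formula for the probability that i beats j (assumes 0 < p < 1)."""
    a = p_i * (1.0 - p_j)
    b = (1.0 - p_i) * p_j
    return a / (a + b)


def implied_win_probs(p):
    """Average pairwise win probability of each player over all other players.

    Assumption: each of the other n - 1 players is an equally likely opponent.
    """
    n = len(p)
    return [
        sum(pairwise(p[i], p[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


p = [0.2, 0.5, 0.8]
implied = implied_win_probs(p)
print([round(x, 3) for x in implied])
```

For p = [0.2, 0.5, 0.8] the implied values come out at roughly 0.129, 0.5 and 0.871 rather than 0.2, 0.5 and 0.8. That is in line with the simulated [0.12, 0.49, 0.87] reported in the question, which suggests the gap comes from the formula itself and not from sampling noise.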