Basic Bayesian modeling: what tricks are used to obtain computational tractability?

35 Views Asked by At

Suppose there are $N$ people in a two-person game (in my case, Super Smash Bros. Melee), and imagine a simple model where the players have "skill levels" $s_i\in\mathbb{R}$ for $1\leq i\leq N$, and the probability person $i$ will win a game over person $j$ is

$$ \mathbb{P}(i\text{ wins over }j)=\sigma(s_i-s_j) $$

where $\sigma:\mathbb{R}\rightarrow[0,1]$ is some monotonically-increasing function.

Since $\mathbb{P}(j\text{ wins over }i)=1-\mathbb{P}(i\text{ wins over }j)$, the only constraint on $\sigma$ is that

$$ \sigma(x)+\sigma(-x)=1. $$

The logistic sigmoid is one choice satisfying this.

Note the probability of player $i$ having $w$ wins and $l$ losses over player $j$ is binomial-distributed,

$$ \mathbb{P}(i\text{ has }w\text{ wins, }l\text{ losses over }j)=\binom{l+w}{w} \left(1-\sigma \left(s_i-s_j\right)\right){}^l \sigma \left(s_i-s_j\right){}^w. $$

Now suppose further that we've observed a number of matches between players, with $w(i,j)$ and $\ell(i,j)$ the number of times player $i$ won and lost against player $j$. From these matches, I'd like to estimate the $s_i$, which initially are assumed distributed via some uninformative Bayesian prior $\mathbb{P}(s_i,\ldots,s_N)$.

By Bayes Theorem, we can obtain the posterior distribution of the $s_i$ by

$$ \mathbb{P}(s_i,\ldots,s_N|\text{observed matches}) = \frac{\mathbb{P}(\text{observed matches}|s_i,\ldots,s_N)\mathbb{P}(s_i,\ldots,s_N)}{C} \\ =\frac{\mathbb{P}(s_i,\ldots,s_N)}{C}\prod_{i=1}^N\prod_{j=1}^N\binom{l(i,j)+w(i,j)}{w(i,j)} \left(1-\sigma \left(s_i-s_j\right)\right){}^{l(i,j)} \sigma \left(s_i-s_j\right){}^{w(i,j)} $$

where $C=\int_{\mathbb{R}^N}\mathbb{P}(\text{observed matches}|s_i,\ldots,s_N)\mathbb{P}(s_i,\ldots,s_N)\,\mathrm{d}\mathbf{s}$ is a constant normalization factor that I'm ignoring.

I'm stuck here. In my use case, the number of players $N\approx 5000$ and the number of games $\frac{1}{2}\sum_{i,j}w(i,j)+l(i,j)\approx 100000$ (game data obtained via the Smash GG API's history of tournament matches from the past 3 years), and the posterior distribution is not (as best I can tell) separable. As a result, the problem as stated appears computationally intractable.

My question is this:

  • What tricks are commonly used to make high-dimensional Bayesian posteriors like this computationally tractable? Are there approximations I can apply to make it separable?