Modelling question regarding reward functions in Bitcoin mining pools

55 Views Asked by At

I'm currently trying to reproduce the simulation results from Roughgarden et al. (Stanford University), i.e. his paper from 2017 named "Incentive Compatibility of Bitcoin Mining Pool Reward Functions" (link: https://www.ifca.ai/fc16/preproceedings/28_Schrijvers.pdf, pages 13 and 14).

This is a follow-up to my already answered question about the basic modelling of the 99th percentile payout for the case of solo mining. Especially, I want to reproduce the results in said paper on page 14, i.e. Roughgarden et al.

Now in the model of the paper we are acting as a bitcoin pool manager who tries to distribute the bitcoin block rewards among the participants in the pool regarding the computational power they brought into the system. Because the pool operators don't know what exactly that amount of computational power the individual miners brought into the pool is, they have to estimate it. This is done by miners participating in the pool committing so called "shares" (= partial solutions) to the pool operator which are computationally easier to solve than the actual full solution. This is manipulation-proof in the sense that a miner can't find partial solutions along the way if he doesn't try to solve the actual cryptographic "puzzle" and vice versa. Now the question is: how does the pool operator distribute the rewards among the miners participating in the pool?

Like I already described in my previous question, most of the mining pools are using one of the two reward functions of either the "proportional" or "pay-per-last-n-shares" (sliding window of the last N committed shares in the pool). Because of incentive compatibility issues regarding the behaviour of the miners to mine honestly (for example, they could decide not to release their partial solutions immediately after they found them for an increase in the expected payout) Roughgarden et al. introduced a new incentive compatible function.

The setting used is

  • from a game theoretic/mechanism design perspective, we see "one round" as one game, i.e. the end of the solving of one computational puzzle until the next solution of the puzzle.
  • $\alpha_i $ = an individual miner i's computational capacity (which the pool operator has to estimate). Our assumption for reproducing the results is $\alpha_i = 0.001$.
  • $D$ = each partial solution constitutes a full solution with prob. $\frac{1}{D}$. It's important to stress out that this is the probability for a share being a full solution, not for finding a share a priori. In our assumptions this is $D = 1,000,000$.
  • $b_i$ = the number of shares (= partial solutions) committed to the pool operator during by miner i.
  • a unit of time corresponds to the expected time it takes for all miners combined to find a full solution (in reality this is about 10 minutes) and we normalize the reward for finding a full solution to be B 1 (in reality this currently about B 6.25, although it changes over time).

Now my problem is: how do I reproduce the results? Do I have to run several rounds/different players/threads to model the interaction among the players? This especially relates to rounds which can be terminated very quickly if one of the miners finds a solution relatively early after the start of a new round. Also it is not clear to me what is the trick to circumvent the problem that we don't know (and it's not specified) what the probability of finding a partial solution per se is, only what the probability for a share constituting a full solution is mentioned. My guess is to work with expected values (because we're assuming only one pool, so the total probability for miner i to find the next full solution is $\alpha_i$).

The definitions of the proportional and incentive compatible methods are mentioned to give a better overview:

  • The proportional reward function is $R_i^{(prop)}(\boldsymbol{b}) = \frac{\boldsymbol{b_i}}{K}$, where $K$ is the sum of all shares committed by the miners in the pool in the respective round, i.e. $K = \| \boldsymbol{b} \|_1 = \sum_{i = 1}^ n \boldsymbol{b_i}$. This method in the end has an expected value for each individual miner participating in the pool of exactly $\alpha_i$, the true mining capacity of an individual miner.
  • The "Incentive Compatible"-Reward function suggested by Roughgarden et al. is $R_i^{(ic)}: \mathrm{N}^n \times \{1, ..., N\} \to [0, 1]$, \begin{gather*} R_i^{(ic)}(\boldsymbol{b}, s) = \frac{b_i}{\max\{\| \boldsymbol{b}\|_1, D\}} + 1 \{i = s\} \cdot \Big(1 - \frac{\|\boldsymbol{b}\|_1}{\max\{\| \boldsymbol{b}\|_1, D\}}\Big), \end{gather*}where 1 {i = s} denotes the indicator function. One can show analytically that this reward function has better properties in regards to forcing the participating miners to act more honestly.

To sum it up: how can I compute the percentiles and CDF for the three different methods? I'm thinking way too difficult here, because my approach until right now was to model different players and compute the expectation over several thousand rounds. There has to be an easier probabilistic approach.

Thank you for helping me out. Best wishes from Germany.