Can I, by looking at scores from a game, decide how much luck and how much skill is involved?

My wife and I play a lot of rummy against each other, and we keep a record of the score for each game, as well as the accumulated score. Since I have a substantial lead at this point, I claim that I'm a better rummy player than she is, while she argues that it's pure luck.

Most people agree that rummy is a game that combines luck and skill, but for many other games it can be hard to decide.

Assume we have the following list of scores from a game, but we know nothing about the game.

Round   :  1   2   3   4   5   6   7   8   9  10  | Sum  Avg Wins
Player A:  -   -   5   2   -   -  17   -   -   -  |  24  2.4   3
Player B: 12   3   -   -   8   4   -   5  17   4  |  53  5.3   7

Is it possible to examine the scores and decide how much luck/skill is involved? It seems reasonable to me that both the total points and number of wins are important, but are there also other factors?

Best answer

The simplest analysis is for the number of wins. Let $n$ be the number of independent games, and $X$ be the number of wins for Player B. Then under the null hypothesis that the players are equally likely to win any one game, $X \sim \mathsf{Binom}(n, 1/2)$.

If B is in the lead with $x > n/2$ observed wins, we would reject the null hypothesis (and conclude that B is more skillful) if $P(X \ge x)$ is remarkably small. A value smaller than 0.025 might be a reasonable criterion.

In your case, $n = 10$ and $P(X \ge 7) = 1 - P(X \le 6) = 0.1719,$ which is not remarkably small. (The computation in R statistical software is shown below.)

1 - pbinom(6, 10, .5)   # P(X >= 7) = 1 - P(X <= 6) for X ~ Binom(10, 1/2)
## 0.171875
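As a cross-check, the same tail probability can be obtained from R's built-in exact binomial test. This is only a sketch: the object names `res7` and `res9` are mine, and the choice of a one-sided alternative (`"greater"`) is an assumption that matches the tail probability $P(X \ge x)$ used here.

```r
# One-sided exact binomial test: H0: p = 1/2 against H1: p > 1/2
res7 <- binom.test(7, 10, p = 0.5, alternative = "greater")
res7$p.value   # same quantity as 1 - pbinom(6, 10, .5)

# The same test if B had won 9 of the 10 games
res9 <- binom.test(9, 10, p = 0.5, alternative = "greater")
res9$p.value
```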

If we had $x = 9$ then $P(X \ge x) = 0.0107,$ which might be persuasive to you or a neutral party, but possibly not to your wife who starts with a deeply held belief that rummy is a game of chance.

It is not the proportion of wins alone that governs the outcome; the number of games matters too. If Player B had $x = 35$ observed wins in $n = 50$ games (the same 70% as 7 wins in 10), then the computation would be $P(X \ge 35) = 1 - P(X \le 34) = 0.0033,$ a persuasive result. So if you maintain your proportionate lead for a larger number of games, then you have a good case that you are more skillful.
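To see how the evidence builds up as the number of games grows, one might tabulate the tail probability for a fixed 70% win rate. This is a rough sketch; the particular values of `n` are chosen only for illustration.

```r
# Tail probability P(X >= x) under H0 for a 70% win rate at several n
n <- c(10, 20, 30, 40, 50)
x <- 7 * n / 10                                   # 70% of the games won
p_value <- pbinom(x - 1, n, 0.5, lower.tail = FALSE)  # P(X > x - 1) = P(X >= x)

data.frame(games = n, wins = x, p_value = round(p_value, 4))
```

The first and last rows correspond to the two cases worked out above ($n = 10$ and $n = 50$).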

If you look in a statistics textbook under 'one-sample binomial test', you can find a more technical explanation. You may also find something about a normal approximation, which might make sense for $n = 50,$ but not for as few as $n = 10$ games.
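If you want to compare the normal approximation with the exact binomial tail yourself, something like the following would do. The helper `normal_vs_exact` is a hypothetical name of mine, and textbooks differ on whether to apply the continuity correction (subtracting 0.5), so both versions are shown.

```r
# Compare the exact binomial tail P(X >= x) with normal approximations
# under H0: p = 1/2, where X has mean n/2 and standard deviation sqrt(n/4).
normal_vs_exact <- function(x, n) {
  exact <- pbinom(x - 1, n, 0.5, lower.tail = FALSE)
  s     <- sqrt(n / 4)
  c(exact        = exact,
    normal_cc    = pnorm((x - 0.5 - n / 2) / s, lower.tail = FALSE),  # with continuity correction
    normal_plain = pnorm((x - n / 2) / s, lower.tail = FALSE))        # without correction
}

normal_vs_exact(7, 10)
normal_vs_exact(35, 50)
```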


Using scores would require a knowledge of rummy that I do not have. If the winner has score $12,$ does it make sense to say that the loser has score $-12?$ If so, here is a formal statistical test.

I find it hard to believe that these scores are normal, so I'm using a one-sample Wilcoxon ('signed-rank') test of the hypothesis that the population median score is $0$ against the two-sided alternative that it is not. This test does not require normal data.

The results from R statistical software are shown below. To be persuasive, one would need the P-value to be less than about 0.05. (The warning message arises because some absolute values in the data are tied: the two 4's, and also $\pm 5$ and $\pm 17$. With ties, R falls back on a normal approximation with continuity correction; I investigated this and found that the exact P-value must be above 0.20.)

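# Signed scores per round: positive when B wins, negative when A wins
y <- c(12, 3, -5, -2, 8, 4, -17, 5, 17, 4)
wilcox.test(y)   # one-sample signed-rank test of median 0, two-sided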


        Wilcoxon signed rank test with continuity correction

data:  y 
V = 39, p-value = 0.2613
alternative hypothesis: true location is not equal to 0 

Warning message:
In wilcox.test.default(y) : cannot compute exact p-value with ties
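One way to get around the ties is to enumerate all $2^{10} = 1024$ sign patterns directly, keeping the observed midranks fixed. This is a sketch of that idea, not necessarily how the figure quoted above was obtained; the names `V_obs`, `V_all`, and `p_exact` are mine, and doubling the smaller tail is an assumed convention for the two-sided P-value.

```r
y <- c(12, 3, -5, -2, 8, 4, -17, 5, 17, 4)

r     <- rank(abs(y))        # midranks: tied absolute values share an average rank
V_obs <- sum(r[y > 0])       # observed signed-rank statistic (sum of positive ranks)

# Under H0 each score is equally likely to be positive or negative,
# so enumerate every sign pattern (1 = positive, 0 = negative).
signs <- as.matrix(expand.grid(rep(list(0:1), length(y))))
V_all <- as.vector(signs %*% r)

# Two-sided P-value: double the smaller tail, capped at 1
p_exact <- min(1, 2 * min(mean(V_all >= V_obs), mean(V_all <= V_obs)))
p_exact
```

Here `V_obs` reproduces the statistic `V = 39` reported by `wilcox.test`.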

Again here, it is possible that data from more games with a continuing difference between players would yield significant results.