Standard Deviation of 26 Cards

What is the standard deviation for 26 cards drawn from a standard deck (each card's value ranges between 1 and 13)? Assume selection is with replacement.

I saw someone approach this problem by summing the variance for one card 26 times and then taking the square root. However, isn't this approach wrong as we are looking for the standard deviation of the 26 cards and not of the sum of the 26 cards?

Here is my approach by using the mean of the sum of the 26 cards:

Var(X) = Var((X1 + X2 + ... + X26)/26) = (1/26^2) (26 * Var(X1)) = (1/26^2) (26 * 14)

Var(X) = 14/26. This is wrong, however.

There are 1 best solutions below

I suspect you may be confusing the variance of the sample mean with the expectation of the sample variance.

The variance of the value of one card drawn uniformly from $\{1,2,\ldots,13\}$ is $\frac{13^2-1}{12}=14$.
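For reference, the value $14$ can be checked directly from the first two moments of a uniform variable $X$ on $\{1,\ldots,n\}$ with $n=13$ (the symbols $X$ and $n$ are introduced here only for this check):

$$E[X]=\frac{n+1}{2}=7,\qquad E[X^2]=\frac{(n+1)(2n+1)}{6}=\frac{14\cdot 27}{6}=63,$$

$$\operatorname{Var}(X)=E[X^2]-E[X]^2=63-49=14=\frac{n^2-1}{12}.$$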

So drawing $26$ cards independently with replacement, the variance of the sample sum is $26 \times 14=364$ and the variance of the sample mean is $\frac{14}{26} \approx 0.538$. The standard deviations are the square roots of these.
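Spelled out, writing $X_1,\ldots,X_{26}$ for the independent draws (as in the question) and $\bar X$ for their mean, a notation assumed here:

$$\operatorname{Var}\Big(\sum_{i=1}^{26} X_i\Big)=\sum_{i=1}^{26}\operatorname{Var}(X_i)=26\times 14=364,\qquad \sqrt{364}\approx 19.08,$$

$$\operatorname{Var}(\bar X)=\frac{1}{26^2}\operatorname{Var}\Big(\sum_{i=1}^{26} X_i\Big)=\frac{14}{26}\approx 0.538,\qquad \sqrt{\tfrac{14}{26}}\approx 0.734.$$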

If instead you are interested in the expectation of the sample variance, it is $14$ with Bessel's correction and $14 \times \frac{26-1}{26}\approx 13.46$ without it. The expected sample standard deviation is not the square root of this but slightly less, because the square root is a concave rather than a linear function.
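In symbols, with $n=26$, $\sigma^2=14$ and $S^2=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar X)^2$ denoting the Bessel-corrected sample variance (notation assumed here, not taken from the question):

$$E[S^2]=\sigma^2=14,\qquad E\Big[\tfrac{n-1}{n}S^2\Big]=\tfrac{25}{26}\times 14\approx 13.46,$$

$$E[S]\le\sqrt{E[S^2]}=\sqrt{14}\approx 3.742\quad\text{(Jensen's inequality)}.$$

So $\sqrt{14}$ is only an upper bound for the expected sample standard deviation.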

Here is a simulation in R demonstrating this (with a little simulation noise):

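set.seed(2023)
# Draw one sample and return its sum, mean, variance and standard deviation.
# var() and sd() use Bessel's correction (divisor n - 1).
samplesome <- function(values, samplesize, replacement) {
  x <- sample(values, samplesize, replace = replacement)
  c(sum(x), mean(x), var(x), sd(x))
}
# Each column of sims holds the four statistics for one draw of 26 cards.
sims <- replicate(10^5, samplesome(rep(1:13, 4), 26, TRUE))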
var(sims[1,])   # variance of sample sums
# 362.7971
var(sims[2,])   # variance of sample means
# 0.536682
mean(sims[3,])  # mean of sample variances
# 14.00217
mean(sims[4,])  # mean of sample standard deviations
# 3.726031

An interesting point is that if you sample half the deck without replacement, you roughly halve the variance of the sample sum and of the sample mean (they become $26\times 14\times\frac{52-26}{52-1}\approx 185.6$ and $\frac{14}{26}\times\frac{52-26}{52-1}\approx 0.275$), while the expected sample variance increases slightly. Bessel's correction is too strong in this case: clearly, if you sampled the whole deck, the sample sum and sample mean would have zero variance while the sample variance with no correction would be exactly $14$. Replacing TRUE in the code above by FALSE would give:

var(sims[1,])   # variance of sample sums
# 185.9986
var(sims[2,])   # variance of sample means
# 0.2751459
mean(sims[3,])  # mean of sample variances
# 14.27443
mean(sims[4,])  # mean of sample standard deviations
# 3.770068
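
The simulated figures can be set against closed-form values. The following is a minimal sketch, assuming the deck is the 52 values `rep(1:13, 4)` as in the simulation. It uses the standard finite population correction $\frac{N-n}{N-1}$ and the textbook result that, without replacement, the Bessel-corrected sample variance has expectation $\sigma^2\frac{N}{N-1}$. The names `deck`, `fpc` and `theory` are illustrative and not part of the original code.

```r
# Closed-form values to compare against the simulated output.
# Assumption: the deck is the 52 values rep(1:13, 4), as in the simulation.
deck   <- rep(1:13, 4)
N      <- length(deck)                      # population size (52)
n      <- 26                                # sample size
sigma2 <- mean((deck - mean(deck))^2)       # population variance, divisor N

fpc <- (N - n) / (N - 1)                    # finite population correction

theory <- rbind(
  with_replacement    = c(var_sum         = n * sigma2,
                          var_mean        = sigma2 / n,
                          mean_sample_var = sigma2),
  without_replacement = c(var_sum         = n * sigma2 * fpc,
                          var_mean        = sigma2 / n * fpc,
                          mean_sample_var = sigma2 * N / (N - 1))
)
print(round(theory, 4))
```

The without-replacement row comes out at roughly 185.6, 0.275 and 14.27, in line with the simulated values up to simulation noise. No closed form is given for the mean of the sample standard deviations, so that figure is left to the simulation.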