How to compute probability of a bootstrap sample


The Question

Consider the samples $\{1, 3, 4, 6\}$ from some distribution.

a) For one random bootstrap sample, find the probability that the mean is $1$.

b) For one random bootstrap sample, find the probability that the maximum is $6$.

c) For one random bootstrap sample, find the probability that exactly two elements in the sample are less than $2$.

My Understanding

We just started learning the bootstrap in class and I came across this question. I'm a little confused, because the question seems too easy: the mean of any sample with those numbers is always $3.5$, so a) is $0$. The maximum will always be $6$, so b) is $1$. And two of the numbers cannot be less than $2$, so c) is $0$.

Is there something major that I'm missing?


There are 2 best solutions below


My understanding is that you should take a random sample, presumably of size $4$, from those four values with replacement. This means that the values need not all appear in a bootstrap sample, in which case the probabilities you are asked to find will not be those obtained from the given sample.


I think this is a very thoughtful question, leading up to an understanding of how bootstrapping works. Your attempted answers are not exactly on target, so here is my attempt to clarify. Because bootstrapping is a simulation-based method, I begin with simulation results before showing exact binomial probabilities for each part.

Simulation: One bootstrap sample takes four values at random, with replacement, from the set $\{1,3,4,6\}.$ Let's simulate many re-samples and see what happens. With a million simulated bootstrap samples, the simulated probabilities should be accurate to two or three decimal places.

(a) Average of four is $1;$ (b) max is $6;$ (c) need exactly two $1$'s.

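set.seed(1)
x = c(1,3,4,6)

# (a) proportion of re-samples whose mean is 1
a = replicate(10^6, mean(sample(x, 4, replace=TRUE)))
mean(a == 1)
[1] 0.003916  # exact: 0.00390625

# (b) proportion of re-samples whose maximum is 6
w = replicate(10^6, max(sample(x, 4, replace=TRUE)))
mean(w == 6)
[1] 0.683426  # exact: 0.6835938

# (c) proportion of re-samples containing exactly two 1's
nr.ones = replicate(10^6, sum(sample(x, 4, replace=TRUE)==1))
mean(nr.ones==2)
[1] 0.210837  # exact: 0.2109375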

So the respective probabilities for parts (a)-(c) are approximately $0.004, 0.684,$ and $0.211.$
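Because there are only $4^4 = 256$ equally likely ordered bootstrap samples of size four, the simulated values can also be checked by listing every one of them. What follows is a minimal sketch of that idea rather than part of the simulation; the names `samples`, `p_mean_is_1`, `p_max_is_6`, and `p_two_below` are introduced here purely for illustration.

```r
x <- c(1, 3, 4, 6)

# All 4^4 = 256 ordered samples of size 4 drawn with replacement;
# each row is one equally likely bootstrap sample.
samples <- as.matrix(expand.grid(x, x, x, x))

# Proportion of rows satisfying each condition = exact probability.
p_mean_is_1 <- mean(rowMeans(samples) == 1)        # part (a)
p_max_is_6  <- mean(apply(samples, 1, max) == 6)   # part (b)
p_two_below <- mean(rowSums(samples < 2) == 2)     # part (c)

c(a = p_mean_is_1, b = p_max_is_6, c = p_two_below)
```

The three proportions should come out as $1/256,$ $175/256,$ and $54/256,$ which agree with the simulated values to about three decimal places.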

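Exact binomial probabilities: Each of the four draws is independent and yields any particular value with probability $1/4,$ so the number of times a given value appears is binomial. Exact probabilities are computed below using R, where dbinom is the binomial PMF (R's documentation calls it the density) and pbinom is the binomial CDF. You can also easily use the binomial PMF formula to do the computations by hand.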

(a) Ones are successes. The number of ones in four draws is $X_1 \sim \mathsf{Binom}(n=4, p = 1/4).$ In order for the average to be $1,$ we need all ones. $P(X_1 = 4) = 0.0039.$

dbinom(4, 4, 1/4)
[1] 0.00390625
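For a check by hand, the same value should follow from the binomial formula with $n = 4$ and $p = 1/4$:

$$P(X_1 = 4) = \binom{4}{4}\left(\tfrac{1}{4}\right)^{4}\left(\tfrac{3}{4}\right)^{0} = \frac{1}{256} = 0.00390625.$$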

(b) Sixes are successes. In order for the max to be $6,$ we need at least one six. With $X_2 \sim \mathsf{Binom}(4,1/4),$ we have $P(X_2 \ge 1) = 0.6836.$

sum(dbinom(1:4, 4, 1/4))
[1] 0.6835938

1 - dbinom(0, 4, 1/4)
[1] 0.6835938
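Worked by hand, the complement form is the shorter route, since "no six in four draws" has probability $(3/4)^4$:

$$P(X_2 \ge 1) = 1 - P(X_2 = 0) = 1 - \left(\tfrac{3}{4}\right)^{4} = 1 - \frac{81}{256} = \frac{175}{256} \approx 0.6836.$$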

(c) Values below $2$ are successes. Only ones are below $2,$ so we need exactly two ones. With $X_3 \sim \mathsf{Binom}(4,1/4),$ the probability is $P(X_3 = 2) = 0.2109.$

dbinom(2, 4, 1/4)
[1] 0.2109375
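By hand, writing $X_3$ for the number of ones in the four draws (a label used here for illustration), the binomial formula gives:

$$P(X_3 = 2) = \binom{4}{2}\left(\tfrac{1}{4}\right)^{2}\left(\tfrac{3}{4}\right)^{2} = 6 \cdot \frac{9}{256} = \frac{54}{256} = 0.2109375.$$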