Bootstrap method & Confidence Interval

Question

363 Views Asked by Bumbble Comm At 10 May 2026 - 11:48

I'm trying to figure out how this method works. My data:

1000 samples from an unknown distribution.
I need to create 40 vectors from those 1000 samples (each vector with 20 samples).
For each of the 40 vectors, I need to apply the bootstrap method as follows:
- Find the confidence interval ($\alpha$ = 0.05) by three methods: t, quantiles & normal.
- The confidence interval is for the standard deviation.

(R language)

My way until now:

I've created these 40 vectors (each one with 20 samples).
Let's say that the number of bootstrap re-samples is 1000.
What exactly is the process of "doing bootstrap" for each vector of 20 samples? How can we create a confidence interval for such a vector by each of the methods I've mentioned?

I will be glad for any help.

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2017-04-07 21:07:31

@BruceET

I cannot comment because I had to reset my account. Anyway, I'm interested only in a CI for the population standard mean. By 3 methods, I meant that there are 3 options for calculating the CI (one with quantiles, one with the t distribution, and one with the normal distribution).

I'd like to provide R code, but working out what the code should be is exactly my challenge :)

**Bumbble Comm** · Answer 2 · 2017-04-07 22:01:59

Here is one example of finding a 95% nonparametric bootstrap confidence interval for the population standard deviation (SD) $\sigma,$ based on a sample x of size $n = 20$ from an unknown population distribution.

 x
 [1] 240 314 354 183 321 325 271 273 272 255
[11] 276 250 261 303 348 294 274 254 258 421
s.obs = sd(x);  s.obs
50.67365

If we knew the population distribution, we could derive from it the distribution of the ratio $R = S/\sigma.$ Then we could find values $L$ and $U$ that cut 2.5% from the lower and upper tails, respectively, of the distribution of $R$ so that

$$0.95 = P(L \le R \le U) = P\left(\frac{S}{U} \le \sigma \le \frac{S}{L}\right),$$

where $S$ is the sample SD of the sample of $n = 20.$ Then the desired 95% CI would be $(S/U,\,S/L).$
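To make the known-distribution case concrete, here is a minimal sketch that simulates $R = S/\sigma$ and reads off $L$ and $U.$ The normal population with mean 300 and SD 50, the seed, and the number of simulated samples `m` are assumptions chosen only for illustration; for a normal population the exact values are also available, because $(n-1)S^2/\sigma^2 \sim \mathsf{Chisq}(n-1).$

```r
# Assumption for illustration only: the population is Normal(300, 50)
set.seed(2024)                          # arbitrary seed, for reproducibility
n <- 20;  sigma <- 50;  m <- 10000      # m = number of simulated samples

# For each simulated sample of size n, record the ratio R = S/sigma
r <- replicate(m, sd(rnorm(n, mean = 300, sd = sigma)) / sigma)

# L and U cut 2.5% from the lower and upper tails of the simulated R values
L <- quantile(r, 0.025, names = FALSE)
U <- quantile(r, 0.975, names = FALSE)
c(L = L, U = U)

# Exact values for a normal population: sqrt(chi-square quantile / (n-1))
exact <- sqrt(qchisq(c(0.025, 0.975), n - 1) / (n - 1))
exact
```

The simulated quantiles should land close to the exact ones; the gap shrinks as `m` grows.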

However, we do not know the distribution of $R$ and we seek to estimate $L$ and $U$ by using a bootstrap method.

Entering the so-called bootstrap world, we take $B = 1000$ re-samples from x, each of them a re-sample of size $n = 20$ taken with replacement from x. Temporarily, we take the observed SD ($S_{obs} = 50.67365$) as a proxy for the unknown population SD $\sigma;$ that is, $\sigma^* = 50.67365.$ Then, for each of the $B$ re-samples, we find $R^* = S^*/\sigma^*.$ We find quantiles .025 and .975 of the $B$ values $R^*$ as estimates $L^*$ of $L$ and $U^*$ of $U,$ respectively. [Notice that quantities referring to re-sampling are denoted by $*$'s.]

Back in the real world, we find the 95% nonparametric bootstrap CI of $\sigma$ as $(S_{obs}/U^*, S_{obs}/L^*).$ [Here $S_{obs}$ returns to its original role as the observed SD of our sample x.]

The R code for this procedure follows. In the code we use the suffix .re instead of $*$.

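B = 1000;  n = length(x);  sg.re = s.obs;  r.re = numeric(B)
for (i in 1:B) {
   x.re = sample(x, n, replace = TRUE)   # re-sample of size n, with replacement
   s.re = sd(x.re)                       # SD of this re-sample
   r.re[i] = s.re/sg.re  }               # ratio R* = S*/sigma*
L.re = quantile(r.re, .025);  U.re = quantile(r.re, .975)
LCL = s.obs/U.re;  UCL = s.obs/L.re      # dividing by U.re gives the lower limit
c(LCL, UCL)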
   97.5%     2.5% 
37.33546 88.26224

So the 95% nonparametric bootstrap CI for $\sigma$ is $(37.3,\,88.3).$ Because this is a simulation procedure, subsequent runs may give slightly different results. My second run of the program above gave slightly different results that still round to $(37.3,\,88.3).$
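If you want to see the run-to-run variation for yourself, or to reuse the procedure on several vectors, the same steps can be wrapped in a function. This is only a sketch: `boot_sd_ci` is a hypothetical helper name (not from any package), and the seed is arbitrary.

```r
# The sample x shown above
x <- c(240, 314, 354, 183, 321, 325, 271, 273, 272, 255,
       276, 250, 261, 303, 348, 294, 274, 254, 258, 421)

# Hypothetical helper: ratio-based bootstrap CI for the population SD
boot_sd_ci <- function(x, B = 1000, level = 0.95) {
  s.obs <- sd(x)
  # B values of R* = S*/sigma*, with sigma* = s.obs
  r.re <- replicate(B, sd(sample(x, length(x), replace = TRUE)) / s.obs)
  a <- (1 - level) / 2
  q <- quantile(r.re, c(a, 1 - a), names = FALSE)   # estimates of L and U
  c(LCL = s.obs / q[2], UCL = s.obs / q[1])
}

set.seed(101)            # arbitrary seed; fixing it makes the runs repeatable
ci1 <- boot_sd_ci(x)
ci2 <- boot_sd_ci(x)     # second run uses different re-samples
rbind(ci1, ci2)
```

For the 40 vectors in the question, something like `t(sapply(vecs, boot_sd_ci))` would give one interval per row, assuming `vecs` is a list holding the 40 vectors.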

Now it is time for a confession: I generated x from a normal distribution as follows:

set.seed(1234); x = round(rnorm(20, 300, 50))

So I know that the data are normal. The standard 95% CI for $\sigma$ of a normal population is $\left(\sqrt{\frac{(n-1)S^2}{U_q}}, \sqrt{\frac{(n-1)S^2}{L_q}}\right),$ where $L_q$ and $U_q$ are quantiles .025 and .975, respectively, of $\mathsf{Chisq}(\nu = n-1).$ So the traditional parametric 95% CI for $\sigma$ is $(38.5, 74.0)$. Because knowing the population distribution introduces new and useful information into the process of estimation, we cannot expect the normal-based CI to be the same as the bootstrap CI, but they are not much different for practical purposes.

sqrt((n-1)*var(x) / qchisq(c(.975,.025), n-1))
[1] 38.53682 74.01249
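As for the three methods named in the question (quantiles, normal, t), here is a rough sketch of one common reading of them, applied to the SD. These are assumptions about what was meant, not a definitive recipe: the quantile interval uses quantiles of the re-sampled SDs directly, while the normal and t intervals use the observed SD plus or minus a normal or t quantile times the bootstrap standard error. In particular, the "t" version below is not the studentized bootstrap-t, which needs a standard error for each re-sample. The seed is arbitrary.

```r
x <- c(240, 314, 354, 183, 321, 325, 271, 273, 272, 255,
       276, 250, 261, 303, 348, 294, 274, 254, 258, 421)

set.seed(202)                      # arbitrary seed
B <- 1000;  n <- length(x);  alpha <- 0.05
s.obs <- sd(x)

# SD of each of B re-samples taken with replacement from x
s.re <- replicate(B, sd(sample(x, n, replace = TRUE)))
se.boot <- sd(s.re)                # bootstrap estimate of the SE of S

# Quantile (percentile) interval: quantiles of the re-sampled SDs
ci.quant <- quantile(s.re, c(alpha/2, 1 - alpha/2), names = FALSE)
# Normal interval: observed SD +/- z * bootstrap SE
ci.norm <- s.obs + c(-1, 1) * qnorm(1 - alpha/2) * se.boot
# t interval (assumed reading): same form, t quantile with n - 1 df
ci.t <- s.obs + c(-1, 1) * qt(1 - alpha/2, df = n - 1) * se.boot

rbind(quantile = ci.quant, normal = ci.norm, t = ci.t)
```

None of these three coincides with the ratio-based interval above, so some disagreement among them is to be expected, especially with $n$ as small as 20.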
