Difference and confidence intervals


I performed several series of simulations to estimate the values of two parameters. The results can be presented like this:

SIM 1
X:      1.0 2.0 3.0 4.0 etc.
FA(X):  1.1 2.3 4.5 1.1 etc.
FB(X):  1.2 2.1 3.2 2.3 etc.

SIM 2
X:      1.0 2.0 3.0 4.0 etc.
FA(X):  1.3 2.2 4.3 1.1 etc.
FB(X):  1.2 2.1 3.2 2.4 etc.

SIM 3
X:      1.0 2.0 3.0 4.0 etc.
FA(X):  1.1 2.3 4.5 1.4 etc.
FB(X):  1.5 2.2 3.3 2.3 etc.

Now I want to calculate the difference between those F's, along with a confidence interval for it. The desired result would be something like "the difference between A and B is below 0.1 with 95 % confidence".

I was thinking about calculating max(abs(FA - FB)) for every set of data and then using those values to compute a mean and a one-sided confidence interval. But I'm not certain this is the right method. Could anyone point out a suitable approach?
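To make the idea concrete, here is a rough R sketch of the computation I had in mind, using the example numbers above (the t-based one-sided bound is just my guess at how the interval might be formed):

```r
# Max absolute difference per simulation run (values from the example above)
fa = list(c(1.1, 2.3, 4.5, 1.1), c(1.3, 2.2, 4.3, 1.1), c(1.1, 2.3, 4.5, 1.4))
fb = list(c(1.2, 2.1, 3.2, 2.3), c(1.2, 2.1, 3.2, 2.4), c(1.5, 2.2, 3.3, 2.3))
d = mapply(function(a, b) max(abs(a - b)), fa, fb)  # one number per run
mean(d)                                             # mean of the max differences
# One-sided 95% upper bound, assuming approximate normality of d:
mean(d) + qt(0.95, df = length(d) - 1) * sd(d) / sqrt(length(d))
```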

Any help appreciated. Thank you.


BEST ANSWER

Your explanation is a bit vague for a precisely targeted response, but you are flirting with ideas that are really important in modern data analysis. In particular, it seems worth mentioning two well-established inferential methods that use simulation in a way somewhat similar to yours. I will discuss one in some detail and mention the other.

Permutation Test. Suppose you have samples of size 15 from two different populations and want to test whether one tends to give larger values than the other. Also suppose that standard 2-sample tests are inappropriate (perhaps too many tied observations for a Wilcoxon rank-sum test, and data obviously too far from normal for a 2-sample t-test).

The first thing you need to do is to decide on a "metric"--a way to measure the difference between two sets of data. The simplest is the difference between the two sample means. (You might instead pick the difference between the two sample medians, the Wilcoxon statistic, or the t-statistic. The latter two are not necessarily bad metrics just because circumstances may cause their actual distributions to differ from the published ones.) We begin by finding the two sample means, take the difference between them, and call it 'd.obs'--for observed difference. This is a benchmark for comparison with what comes later.
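Each of these candidate metrics is a one-liner in R. As an illustration (with tiny made-up samples, not the data used later):

```r
a = c(.88, .96, .92, .94, .93)      # small illustrative samples
b = c(.87, .94, .87, .95, .86)
mean(a) - mean(b)                   # difference of sample means (d.obs)
median(a) - median(b)               # difference of sample medians
t.test(a, b)$statistic              # two-sample t-statistic
wilcox.test(a, b)$statistic         # Wilcoxon rank-sum statistic
```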

If the two populations really are equal, then the two samples of 15 observations are essentially equivalent. We combine them into one sample of 30, permute the sample at random, and pick the first 15 as a first permuted sample and the second 15 as a second permuted sample. For the first iteration we find the difference between the means of the two permuted samples, and call it 'd.perm[1]'.

There are C(30, 15) = 155,117,520 ways to split the 30 into two samples of 15, and we can't possibly look at them all. If we could, we could look at the distribution of all the d.perm's and see whether d.obs is far out in its right tail, leading us to reject the null hypothesis and conclude that the first population tends to give larger values than the second; otherwise decide the data at hand don't have enough information to come to that conclusion.
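The count of possible splits is just a binomial coefficient, which R can confirm directly:

```r
choose(30, 15)   # ways to split 30 observations into two groups of 15
# 155117520
```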

While we can't look at all C(30, 15) splits, we can use a simulation program in R to look at a large enough number of them to do the job. This gives us a 'simulated permutation distribution'. The program below does 100,000 permutations (splits) and records the metric 'd.perm[i]' for each. The proportion of values of the metric to the right of d.obs is called the P-value. If it is smaller than 5% we reject the null hypothesis.

For the (fake) data shown in the program the means are 0.923 and 0.893, giving d.obs = 0.03. The program gives slightly different answers on each run; one run gave a P-value of 0.001, rounded to three places. The last line of code makes a relevant graph (not shown here). [For one elementary introduction to other kinds of permutation tests, see Eudey, Kerr, and Trumbo (J. Stat. Educ., Vol. 18, No. 1, 2010); you can do a web search for more.]

a = c(.88,.96,.92,.94,.93,.90,.95,.92,.89,.94,.92,.95,.93,.91,.91)
b = c(.87,.94,.87,.95,.86,.88,.89,.91,.86,.92,.89,.90,.89,.88,.89)
d.obs = mean(a) - mean(b);  n1 = length(a);  n2 = length(b)
m = 10^5;  d.perm = numeric(m);  all = c(a, b)
for (i in 1:m) {
  perm = sample(all, n1 + n2)   # random permutation of the combined sample
  d.perm[i] = mean(perm[1:n1]) - mean(perm[(n1+1):(n1+n2)])
}
d.obs;  mean(d.perm > d.obs)    # P-value
hist(d.perm, prob=T);  abline(v=d.obs, col="red")

Bootstrap Confidence Intervals. This is a simulation method for finding confidence intervals in situations where standard methods do not work well. One difference from permutation tests is that the data are repeatedly 're-sampled' WITH replacement. [There is rich literature on this topic and you can get a good start from the bibliography in Wikipedia on 'bootstrapping (statistics)'.]
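As a minimal sketch of the idea (not part of the original discussion), a percentile bootstrap CI for the difference of the two means from the data above might look like this:

```r
# Percentile bootstrap CI for a difference of means -- illustrative sketch
a = c(.88,.96,.92,.94,.93,.90,.95,.92,.89,.94,.92,.95,.93,.91,.91)
b = c(.87,.94,.87,.95,.86,.88,.89,.91,.86,.92,.89,.90,.89,.88,.89)
B = 10^4;  d.boot = numeric(B)
for (i in 1:B) {
  # Unlike the permutation test, each sample is resampled
  # from itself, WITH replacement
  d.boot[i] = mean(sample(a, replace=TRUE)) - mean(sample(b, replace=TRUE))
}
quantile(d.boot, c(.025, .975))  # 95% percentile CI for the mean difference
```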