How to estimate a confidence interval on a box plot?


I am new to box plots and have a really hard time understanding them. I have also just learned what a confidence interval is. Can you get a confidence interval from a box plot? If so, how? Any advice for getting a better understanding of box plots would be appreciated.


There are 1 best solutions below


The interquartile range (IQR), which is the height of the box in a boxplot (drawn vertically), is related to the variability of the sample, but is not primarily intended as an estimate of $\sigma.$ (If the data are normal, it is sometimes used that way, because the IQR of a normal population is a fixed multiple of $\sigma.$)
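As a rough illustration of the normal-data case: the IQR of a normal distribution is about $1.349\sigma,$ so dividing the sample IQR by 1.349 gives a crude estimate of $\sigma.$ The sketch below uses only the Python standard library; the function name `sigma_from_iqr` and the simulated sample are made up for this example.

```python
import random
import statistics

def sigma_from_iqr(data):
    """Crude estimate of sigma from the IQR; only sensible for roughly normal data."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
    return (q3 - q1) / 1.349  # a normal population has IQR of about 1.349 * sigma

random.seed(1)
sample = [random.gauss(10, 2) for _ in range(5000)]  # simulated normal data, sigma = 2
est = sigma_from_iqr(sample)
print(est, statistics.stdev(sample))  # the two estimates should be close to each other
```

For skewed data, such as the exponential samples discussed further down, this estimate should not be trusted.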

Some computer programs show a nonparametric confidence interval (CI) for the population median. In Minitab this CI is indicated by a second, smaller, box. In R statistical software the CI is indicated by 'notches' in the sides of the main box.
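One common way to get such a nonparametric CI is from order statistics: the number of observations below the population median is Binomial$(n, 1/2),$ so the interval from the $k$th smallest to the $k$th largest observation has a coverage probability that can be computed exactly. The following is a minimal sketch of that idea, not necessarily the exact procedure Minitab or R uses (some programs interpolate between order statistics); the function name `median_ci` is hypothetical.

```python
import math
import random

def median_ci(data, conf=0.95):
    """Distribution-free CI for the population median from order statistics.

    Returns (lower, upper, coverage), where coverage is the exact
    probability that the interval contains the population median
    (assuming a continuous population).
    """
    xs = sorted(data)
    n = len(xs)
    alpha = 1 - conf
    cum = 0.0  # running value of P(B <= j) for B ~ Binomial(n, 1/2)
    k = 0      # largest k with P(B <= k - 1) <= alpha / 2
    for j in range(n):
        cum += math.comb(n, j) / 2**n
        if cum > alpha / 2:
            break
        k = j + 1
    if k == 0:
        raise ValueError("sample too small for the requested confidence level")
    coverage = 1 - 2 * sum(math.comb(n, j) for j in range(k)) / 2**n
    # interval from the k-th smallest to the k-th largest observation
    return xs[k - 1], xs[n - k], coverage

random.seed(2)
sample = [random.expovariate(1.0) for _ in range(50)]  # exponential, mean 1
lo, hi, cov = median_ci(sample)
print(lo, hi, cov)
```

Because only whole order statistics are used, the achieved coverage is at least the requested level rather than exactly equal to it.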

If boxplots of two independent samples are shown side by side, the notches drawn by R are constructed so that non-overlapping notches indicate a difference between the population medians at roughly the 5% level of significance.
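For reference, R documents its notches as extending $\pm 1.58\,\mathrm{IQR}/\sqrt{n}$ from the sample median. The sketch below mimics that rule in plain Python; it is an approximation, since R computes the box from Tukey's hinges, which can differ slightly from the quartile method assumed here. The names `notch_limits` and `notches_overlap` are invented for this example.

```python
import math
import random
import statistics

def notch_limits(data):
    """Notch endpoints: median +/- 1.58 * IQR / sqrt(n)."""
    n = len(data)
    # Assumption: 'inclusive' quartiles stand in for the hinges R uses.
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    return med - half_width, med + half_width

def notches_overlap(a, b):
    """True if the notch intervals of the two samples overlap."""
    lo_a, hi_a = notch_limits(a)
    lo_b, hi_b = notch_limits(b)
    return lo_a <= hi_b and lo_b <= hi_a

random.seed(3)
x = [random.expovariate(1.0) for _ in range(50)]   # exponential, mean 1
y = [random.expovariate(0.25) for _ in range(50)]  # exponential, mean 4
print(notch_limits(x), notch_limits(y), notches_overlap(x, y))
```

Non-overlap is a quick visual screen, not a substitute for a formal two-sample test.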

Here is a boxplot from Minitab for a sample of size 50 from an exponential population with mean 1. The vertical extent of the brown box is the CI for the population median. [The population in this case has median $\eta = 0.6931 < \mu = 1;$ the sample median is $H = 0.790$ (the location of the horizontal bar within the boxes).]

[Figure: Minitab boxplot of the exponential sample, with the CI for the median shown as a smaller brown box]
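The population median quoted above follows directly from the exponential CDF. For an exponential population with mean $\mu,$ setting the CDF equal to $1/2$ gives

$$F(\eta) = 1 - e^{-\eta/\mu} = \tfrac{1}{2} \quad\Longrightarrow\quad \eta = \mu \ln 2 \approx 0.6931\,\mu,$$

so with $\mu = 1$ the median is $\eta \approx 0.6931,$ which is below the mean because of the right skew.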

The boxplots below are from R. Samples of size 50 are from two different exponential distributions. The non-overlapping notches indicate that the samples come from populations with different medians.

[Figure: side-by-side notched boxplots from R for the two exponential samples]

Note: A characteristic of exponential samples is that they tend to have outliers on the high side of the median because exponential populations are positively skewed. All three of the boxplots above happen to show at least one outlier on the high side.