Confidence interval for the absolute value of the mean


I wondered if, given a confidence interval for the mean M, like

M lies in [-a, b] with 95% confidence, where a > 0 and b > 0,

it is correct to deduce that the absolute value of the mean:

|M| = | (1/N) * sum(x) |

lies in [0, max(a,b)] with the same confidence? Thanks in advance.


There is 1 best solution below

On BEST ANSWER

Disclaimer: I’m not a statistician, but I teach some statistics.

Good question!

This may be your question: Given a population of values for some quantitative variable $x$ and a fixed sample size $N$, suppose that the interval* $[-a,b]$ (where $-a<0<b$) captures exactly 95% of the distribution of means of samples of size $N$ from the population. You then want to know whether the interval $[0,\max(a,b)]$ captures 95% of the distribution of absolute values $|\bar x|$ of means of samples of size $N$ from the same population.

The answer is no, if you mean “captures exactly 95%,” but yes if you want to know whether $[0,\max(a,b)]$ captures at least 95% of the distribution of absolute values of sample means of size $N$.

First of all, the phrase “the 95% confidence interval” is potentially imprecise, because more than one interval can contain 95% of the sample means. Let’s assume a conventional CI that also has the property that the 5% of sample means falling outside it are split equally between the two tails. That assumption makes the definition of the CI unambiguous.***

If $[-a,b]$ is the “balanced CI” for the population, 2.5% of the sample means are less than $-a$, 95% of them are between $-a$ and $b$, and 2.5% of them are greater than $b$.

Suppose for the moment that $a<b$, so that $\max(a,b)=b$. You want to know if $[0,b]$ contains 95% of the absolute values of the sample means. It contains at least that many. All of the sample means that lie between $-a$ and $b$ (which is 95% of them) have absolute values between $0$ and $b$. All of the sample means that lie to the right of $b$ (which is 2.5% of them) have absolute values that are not in $[0,b]$.

However, the remaining 2.5% of sample means (those less than $-a$) may or may not have absolute values in $[0,b]$. There isn’t enough information to determine how many do. Those sample means that are less than $-a$ but not less than $-b$ do have absolute values in $[0,b]$. Those that are less than $-b$ (and hence also less than $-a$) do not. Since exactly 2.5% of the sample means are less than $-a$, the interval $[0,b]$ contains between 95% and 97.5% of all absolute values of sample means.** The case $a>b$ is symmetric, with the two tails swapping roles.
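As a rough numerical check of this argument, here is a minimal simulation sketch. The population (an exponential distribution shifted so that its mean is 0.1), the sample size of 10, the number of trials, and the names `abs_mean_coverage` and `shifted_exponential` are all illustrative assumptions, not part of the question. The balanced interval is read off from empirical quantiles of the simulated sample means, so the 2.5% tails hold only up to simulation resolution.

```python
import random


def abs_mean_coverage(draw, n, trials=20000, seed=0):
    """Simulate sample means, build a balanced 95% interval [lo, hi] from
    their empirical quantiles, and return (lo, hi, fraction of |mean|
    values that fall in [0, max(-lo, hi)])."""
    rng = random.Random(seed)
    means = sorted(
        sum(draw(rng) for _ in range(n)) / n for _ in range(trials)
    )
    tail = trials // 40              # 2.5% of the simulated means per tail
    lo = means[tail]                 # `tail` means lie below lo
    hi = means[trials - tail - 1]    # `tail` means lie above hi
    bound = max(-lo, hi)             # plays the role of max(a, b)
    covered = sum(abs(m) <= bound for m in means) / trials
    return lo, hi, covered


def shifted_exponential(rng):
    # Hypothetical skewed population with mean 0.1 (assumption for the demo).
    return rng.expovariate(1.0) - 0.9


lo, hi, covered = abs_mean_coverage(shifted_exponential, n=10)
print(f"interval [{lo:.3f}, {hi:.3f}], coverage of |mean|: {covered:.4f}")
```

For this right-skewed example, essentially all of the excluded mass should sit in the right tail, so the estimated coverage should come out at or very near 97.5%. A population symmetric about zero would instead give a value near 95%.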

*Note that in practice, a CI is computed from a single sample and straddles the sample mean in such a way that under reasonable assumptions, one can assume with 95% confidence that it contains the actual (unknown) population mean. My description is for an interval that, given a known distribution, straddles the population mean in such a way that 95% of all possible sample means lie within the interval.

**If the distribution of sample means is known, an exact answer can be found; for normally distributed sample means, as an example, it will depend on the normalized difference between $|a|$ and $|b|$.
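For the normal case mentioned in this footnote, the exact coverage can be sketched as follows. This assumes that the sample means are exactly normally distributed and that $[-a,b]$ is the balanced 95% interval, which pins down their mean as $(b-a)/2$ and their standard deviation as $(a+b)/(2z)$ with $z\approx 1.96$. The function name `abs_mean_coverage_normal` is hypothetical; `statistics.NormalDist` is from the Python standard library.

```python
from statistics import NormalDist


def abs_mean_coverage_normal(a, b, level=0.95):
    """Exact P(|sample mean| <= max(a, b)) when the sample means are normal
    and [-a, b] is the balanced interval holding `level` of them."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must both be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level=0.95
    mu = (b - a) / 2                            # centre of the interval
    sigma = (a + b) / (2 * z)                   # half-width equals z * sigma
    sample_means = NormalDist(mu, sigma)
    m = max(a, b)
    return sample_means.cdf(m) - sample_means.cdf(-m)


for a, b in [(1.0, 1.0), (1.0, 1.1), (1.0, 3.0)]:
    print(a, b, round(abs_mean_coverage_normal(a, b), 5))
```

Under these assumptions the result depends on $a$ and $b$ only through the ratio $|b-a|/(a+b)$: it equals 95% when $a=b$ and climbs toward roughly 97.5% as the interval becomes more lopsided.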

***If we don’t make the CI balanced in this way, the interval $[0,b]$ is still known to capture between 95% and 100% of the absolute values, but we don’t know that it captures at most 97.5%.