Constructing joint confidence intervals.


I get an expected distribution for some measurement from past data, $X$. Then, when I get a new measurement, I can compute a p-value — the probability, using the appropriate tail of $X$, that the same process that generated the past data would also produce a value at least this extreme.
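As a minimal sketch of that tail calculation for a normal $X$ (the mean, standard deviation, and new measurement below are illustrative assumptions, not values from the question):

```python
# Sketch: two-sided tail p-value of a new measurement under a fitted normal.
import math

mu, sigma = 10.0, 2.0   # fitted from past data (assumed values)
x_new = 14.5            # new measurement (assumed value)

z = abs(x_new - mu) / sigma
# Normal survival function via the complementary error function:
# P(Z > z) = 0.5 * erfc(z / sqrt(2)); doubling gives the two-sided p-value.
p_value = math.erfc(z / math.sqrt(2))
print(p_value)
```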

Let's say the distribution is normal. Now suppose I have $n$ such distributions from the past ($X_1, X_2, \dots, X_n$). I want to combine them into a single distribution and then compute a p-value for new observations. The question is how to do the combining.

I could just average the distributions, defining a new random variable $X$:

$$X=\frac{\sum_i X_i}{n}$$

This makes the mean of the new distribution the average of the individual means. And if the $X_i$ all have the same variance $\sigma^2$, the standard deviation becomes:

$$\sigma_X = \frac{\sigma}{\sqrt{n}}$$

So, as $n$ increases, the standard deviation gets smaller and smaller. This is justified if the means of the $X_i$ are all the same, since each additional distribution makes us surer about the distribution of $X$. But if the means are different, this doesn't make sense. If we see $X_1$ and $X_2$ with very different means, our confidence interval should get larger, not smaller, when we combine them.
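A quick simulation of the problem just described: averaging two normals with very different means still shrinks the standard deviation to $\sigma/\sqrt{2}$ (the means and $\sigma$ below are illustrative assumptions):

```python
# Sketch: averaging X1 ~ N(0, 1) and X2 ~ N(10, 1) still gives SD 1/sqrt(2),
# even though the means are far apart.
import random

random.seed(0)
N = 200_000
sigma = 1.0

# Each sample is the average of one draw from each distribution.
samples = [(random.gauss(0.0, sigma) + random.gauss(10.0, sigma)) / 2
           for _ in range(N)]

mean = sum(samples) / N
var = sum((s - mean) ** 2 for s in samples) / N
print(mean, var ** 0.5)  # SD near sigma/sqrt(2) ~ 0.707, not wider
```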

If they have the same means, on the other hand, it makes perfect sense to average them and have the variance go down.

So, what function $f(X_1, X_2, \dots, X_n)$ of the $X_i$ has these properties:

$$E(f(X_1,X_2,\dots X_n)) = \frac{E(X_1)+E(X_2)+\dots + E(X_n)}{n}\tag{1}$$

And, if $E(X_1)=E(X_2)=\dots =E(X_n)$: $$V(f(X_1,X_2,\dots X_n)) = \frac{V(X_1)+V(X_2)+\dots +V(X_n)}{n^2}\tag{2}$$

If the expected values differ a lot from each other, the variance $V(f(X_1,X_2,\dots X_n))$ should become larger than equation (2) above, with the excess dictated by how far apart the $E(X_i)$ are.
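One standard quantity with this qualitative behavior (offered as a sketch, not as the unique answer) is the equal-weight mixture of the $X_i$: its mean satisfies property (1), and by the law of total variance its variance is the average of the $V(X_i)$ plus the variance of the means, so it grows exactly with how spread out the $E(X_i)$ are. Note, though, that when the means coincide it gives the average variance rather than the $1/n$ shrinkage of (2). The example means and variances below are illustrative assumptions:

```python
# Sketch: mean and variance of an equal-weight mixture of the X_i,
# via the law of total variance.

def mixture_mean_var(means, variances):
    """Mean and variance of an equal-weight mixture with the given components."""
    n = len(means)
    m = sum(means) / n                       # satisfies property (1)
    avg_var = sum(variances) / n             # within-component spread
    var_of_means = sum((mu - m) ** 2 for mu in means) / n  # between-means spread
    return m, avg_var + var_of_means

# Same means: the variance is just the average component variance.
print(mixture_mean_var([5.0, 5.0, 5.0], [1.0, 1.0, 1.0]))  # (5.0, 1.0)

# Far-apart means: the between-means term inflates the variance.
print(mixture_mean_var([0.0, 10.0, 20.0], [1.0, 1.0, 1.0]))
```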