I have a simple, general question about calculating a statistic over $M$ runs of the same experiment. Suppose I would like to calculate the mean of the values returned by some test. Each run of the test generates $\langle x_1, \dots, x_n \rangle$, possibly of different length. Let's say the statistic is the mean. Which approach would be better, and why:
- Sum all values from the $M$ runs, and then divide by the total number of values
- For each run calculate the average, and then average across all the averages
I believe one of the above might be under- or overestimating the mean slightly, but I don't know which. Thanks for your answers.
$\def\E{{\rm E}}\def\V{{\rm Var}}$Say you have $M$ runs of lengths $n_1,\dots,n_M$. Denote the $j$th value in the $i$th run by $X^i_j$, and let the $X^i_j$ be independent and identically distributed, with mean $\mu$ and variance $\sigma^2$.
In your first approach you calculate
$$\mu_1 = \frac{1}{n_1+\cdots+n_M} \sum_{i=1}^M \sum_{j=1}^{n_i} X^i_j$$
and in your second approach you calculate
$$\mu_2 = \frac{1}{M} \sum_{i=1}^M \left( \frac{1}{n_i} \sum_{j=1}^{n_i} X^i_j\right)$$
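In code, the two estimators look like this (a minimal Python sketch; the `runs` data is hypothetical, purely for illustration):

```python
# M runs of unequal length; the values here are made up for illustration.
runs = [[2.0, 4.0], [1.0], [3.0, 5.0, 7.0]]

# Approach 1: pool all values, divide by the total count.
mu1 = sum(x for run in runs for x in run) / sum(len(run) for run in runs)

# Approach 2: average each run, then average the per-run averages.
mu2 = sum(sum(run) / len(run) for run in runs) / len(runs)

print(mu1, mu2)  # 3.666..., 3.0 — they differ when run lengths differ
```

Note that the two give different numbers on the same data whenever the run lengths are unequal, because approach 2 weights every run equally rather than every value.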
You can compute their expectations:
$$\E(\mu_1) = \frac{1}{n_1+\cdots+n_M} \sum_{i=1}^M \sum_{j=1}^{n_i} \mu = \frac{(n_1+\cdots+n_M)\mu}{n_1+\cdots+n_M} = \mu$$
vs
$$\E(\mu_2) = \frac{1}{M} \sum_{i=1}^M \left( \frac{1}{n_i} \sum_{j=1}^{n_i}\mu \right) = \frac{1}{M} ( M\mu ) = \mu$$
so the estimator is unbiased in both cases. However, if you calculate the variances you will find that
$$\V(\mu_1) = \frac{\sigma^2}{n_1+\cdots+n_M}$$
and
$$\V(\mu_2) = \frac{1}{M^2} \left( \sum_{i=1}^M \frac{1}{n_i} \right) \sigma^2$$
With a little effort (the Cauchy–Schwarz inequality gives $(n_1+\cdots+n_M)\sum_{i=1}^M \frac{1}{n_i} \geq M^2$), you can show that
$$\V(\mu_1)\leq \V(\mu_2)$$
where the inequality is strict except when $n_1=n_2=\cdots=n_M$, i.e. when all of the runs produce the same amount of output. If you need to be convinced of this, work through the details in the case $M=2$, $n_1=1$ and $n_2=N >1$.
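To make the gap concrete, here is a quick numerical check of the two variance formulas for exactly that case, $M=2$, $n_1=1$, $n_2=N$ (Python, taking $\sigma^2 = 1$):

```python
# Var(mu_1) = sigma^2 / (n_1 + ... + n_M)  versus
# Var(mu_2) = (1/M^2) * sum(1/n_i) * sigma^2,  for M = 2, n_1 = 1, n_2 = N.
sigma2 = 1.0
for N in [2, 5, 100]:
    ns = [1, N]
    var1 = sigma2 / sum(ns)
    var2 = sigma2 * sum(1.0 / n for n in ns) / len(ns) ** 2
    print(N, var1, var2)
```

As $N$ grows, $\V(\mu_1)$ shrinks towards zero, while $\V(\mu_2)$ stays above $1/4$: the single short run keeps dragging the second estimator's variance up, no matter how much data the long run contributes.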
Therefore it is better to take your first approach, of summing up the output of all runs and dividing by the total length of the output. The expectation is the same in either case, but the variance is lower with the first approach.
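As a sanity check, a short Monte Carlo simulation bears this out (a sketch using only Python's standard library; the choice of $\mu$, $\sigma$, and the run-length pattern is arbitrary):

```python
import random

random.seed(0)
mu, sigma = 10.0, 2.0
lengths = [1, 2, 10, 50]   # deliberately unequal run lengths
trials = 20000

est1, est2 = [], []
for _ in range(trials):
    runs = [[random.gauss(mu, sigma) for _ in range(n)] for n in lengths]
    # Approach 1: pooled mean over all values.
    est1.append(sum(x for r in runs for x in r) / sum(lengths))
    # Approach 2: mean of the per-run means.
    est2.append(sum(sum(r) / len(r) for r in runs) / len(runs))

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(mean(est1), mean(est2))  # both close to mu = 10
print(var(est1), var(est2))    # var(est1) clearly smaller
```

Both empirical means land near $\mu = 10$, confirming unbiasedness, while the empirical variance of the second estimator is several times that of the first, matching the formulas above.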