I've got a computerized procedure which runs a test and takes a few hundred samples during testing. When finished, it spits back out the average and standard deviation of the samples.
Now, I've run this test several times on test subjects that should be identical, but of course I get back slightly different average and standard deviation data each time. I only have a record of the average and standard deviation returned at the end of the test, not of each individual sample point. A typical data set will look something like the following:
| SUBJECT | AVG   | STD  |
|---------|-------|------|
| 1       | 129.2 | 31.0 |
| 2       | 125.0 | 37.3 |
| 3       | 123.6 | 34.7 |
| 4       | 130.1 | 31.3 |
| ...     | ...   | ...  |
Now, if I just average together the averages I get 127.0, but the standard deviation is only 3.2, when in fact the standard deviation between any actual samples is likely to be closer to 30. Is there a way I can combine my summary statistics that preserves the information I have about the standard deviation between samples?
Unfortunately, I don't have access to the size of the data sets which generated the outputs above (it's somewhere around a hundred points each, but it differs from run to run and wasn't recorded).
If the purpose is to estimate population mean and SD, and assuming the sample sizes are equal, then you can get by with estimates based on averages. Averaging the $\bar X$s estimates the population mean.
To estimate the population SD: (1) square the SDs to get variances, (2) average the variances, and (3) take the square root of that average.
In R, for the numbers you gave:
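A minimal sketch of those three steps, using only the four subjects listed in the question and assuming equal sample sizes (the names `avg`, `std`, `pop_mean` and `pop_sd` are illustrative):

```r
# Summary statistics for the four subjects listed in the question
avg <- c(129.2, 125.0, 123.6, 130.1)
std <- c(31.0, 37.3, 34.7, 31.3)

# Population mean: average the per-subject averages
pop_mean <- mean(avg)

# Population SD: square the SDs, average the variances, take the square root
pop_sd <- sqrt(mean(std^2))

round(pop_mean, 1)  # 127
round(pop_sd, 1)    # 33.7
```

Note that `sd(avg)` would give the 3.2 mentioned in the question. That figure measures how much the averages vary from subject to subject, not how much individual samples vary.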
This is not quite the same thing as the SD of all the observations combined.
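The difference is that the SD of all observations combined also picks up the spread of the subject averages around the grand mean. A rough sketch of the comparison, assuming every subject contributed `n = 100` samples (a made-up value, since the real sizes were not recorded):

```r
# Same four subjects as in the question
avg <- c(129.2, 125.0, 123.6, 130.1)
std <- c(31.0, 37.3, 34.7, 31.3)

n <- 100            # ASSUMED common sample size; the real sizes are unknown
k <- length(avg)    # number of subjects

# Within-subject sum of squares, recovered from each SD
ss_within <- sum((n - 1) * std^2)

# Between-subject sum of squares: spread of the averages around the grand mean
ss_between <- sum(n * (avg - mean(avg))^2)

# SD of all k * n observations treated as one sample
combined_sd <- sqrt((ss_within + ss_between) / (k * n - 1))

# Averaged-variance estimate from the three steps, for comparison
pooled_sd <- sqrt(mean(std^2))

c(combined = combined_sd, pooled = pooled_sd)
```

Under that assumption the two values differ only in the second decimal place (both are about 33.7), so the simpler averaged-variance estimate loses very little here.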
Roughly speaking, the loss in efficiency in estimating the population SD is like losing one observation for each of the separate SDs. That probably doesn't matter when the sample sizes are large, as they are here (around a hundred each).
This is based on the method of getting a combined estimate of variance from variances in each of the groups in a one-factor analysis of variance (ANOVA). You can look up the formulas there.
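For reference, the usual pooled-variance formula from one-factor ANOVA can be written as follows, where $k$ is the number of groups and $n_i$ and $s_i$ are the sample size and SD of group $i$ (this notation is introduced here for illustration, not taken from the question):

$$ s_p^2 = \frac{\sum_{i=1}^{k} (n_i - 1)\, s_i^2}{\sum_{i=1}^{k} (n_i - 1)} $$

When all the $n_i$ are equal, the weights cancel and this reduces to the plain average of the variances, $s_p^2 = \frac{1}{k}\sum_{i=1}^{k} s_i^2$, which is the three-step recipe above. With unequal but unknown sizes all near a hundred, the weights are nearly equal, so the unweighted average should be a close approximation.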