How can I measure validity of the research?


Two groups of rats were tested. Group A (10 individuals) received medication and had an average lifespan of 45 months. Group B (20 individuals) did not receive medication and had an average lifespan of 33 months. Questions:

  1. Is there enough data to conclude whether this research is statistically valid or invalid?

  2. What are the criteria for a statistically valid sample?

Thanks in advance. Also the research in question.

Edit1: changed link to the original research paper.

Edit2: found exact variance range of life span (in days):

Group B1      1041 (950-1132)
Group B2      1059 (964-1154)
Group A       1316 (1221-1441)

I think groups B1 and B2 can be treated as a single group B.

For the sake of indexing and to help those who may be looking for the information on this research, the research in question is a study on chromium picolinate by Gary W. Evans and Lynn K. Meyer.

Best answer:

As noted in the comments, you do need information on the variability of measurements within each group in order to tell whether differences are significant.

Looking at the original paper (linked in your Question) I see several places in which means are given $\pm$ a standard error. The standard error is the (sample) standard deviation divided by the square root of the sample size. Because sample sizes are given, it would be possible to find the standard deviation in these instances.
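As a minimal sketch of that conversion, the snippet below multiplies a standard error by the square root of the sample size. The function name `sd_from_se` and the numbers used are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import math

def sd_from_se(standard_error: float, n: int) -> float:
    """Recover the sample standard deviation from a reported standard error."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    # SE = SD / sqrt(n), so SD = SE * sqrt(n)
    return standard_error * math.sqrt(n)

# Hypothetical values for illustration only (not from the paper).
se = 30.0   # reported standard error, in days
n = 10      # group size
print(round(sd_from_se(se, n), 1))
```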

What is not clear, without a detailed reading of a long article in tiny type, is how the results in the various sections of the paper match the results you quote in your Question.

Note: I suspect the interval you are calling the 'variance' of the observations in each group is really the interval from the minimum to the maximum. The difference max - min is called 'range'. The 'variance' is the square of the standard deviation. Unfortunately, it is not possible to find the standard deviations from the ranges with sufficient accuracy to perform a valid test of significance.

Ordinarily, I would point you to Wikipedia or a basic statistics text for a discussion of a Welch two-sample t test. However, this test depends on having approximately normal data, and lifetimes are notoriously non-normal (because they are heavily right-skewed).


Revision begins here:

However, suppose we consider two groups, combining B1 and B2 into one group B, as you suggest. Then the spans of the two Groups are:

B: 950 to 1154
A: 1221 to 1441

If we had all the data we would be able to do a Welch 2-sample t test, but it is difficult to deduce exact standard deviations from the implied ranges. (The so-called 'rule' of dividing the range by 5 or 6 to get the SD, mentioned in some texts, doesn't work well enough to be reliable for this purpose.) Anyhow, survival data are seldom normal and sampling independently from two normal populations is an assumption of the t test.
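One way to see why a fixed divisor is unreliable is a small simulation. The sketch below assumes normally distributed data (an assumption made only for the illustration; real lifetimes are skewed) and estimates the average ratio of range to sample SD for a few sample sizes. The function name and the number of replications are my own choices.

```python
import math
import random

def mean_range_to_sd_ratio(n: int, reps: int = 2000, seed: int = 1) -> float:
    """Average (max - min) / SD over simulated standard normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        # Sample standard deviation with the usual n - 1 denominator.
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        total += (max(sample) - min(sample)) / sd
    return total / reps

for size in (10, 20, 50):
    print(size, round(mean_range_to_sd_ratio(size), 2))
```

The ratio grows with the sample size, so no single divisor suits both a group of 10 and a group of 20, and individual samples scatter around these averages as well.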

With all of the original data we could also do a Mann-Whitney-Wilcoxon rank sum test, which does not assume normality. But it does assume that measurements are continuous to the extent of not showing tied values.

Also, with all of the original data we could do a permutation test, a nonparametric procedure that does not require either normality or absence of ties. Although we do not have the individual observations usually required for a permutation test, for your particular data, we have something that is just as good: We know that there is no overlap between the survival times for the A and B groups. (The biggest B is 1154 and the smallest A is 1221.)
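For readers who do have individual observations, a Monte Carlo permutation test can be sketched as follows. The lifetimes in `a` and `b` are invented for illustration; they merely respect the group sizes and the reported spans, and are not the study's data. The function name, the one-sided difference-in-means statistic, and the number of resamples are likewise assumptions.

```python
import random

def permutation_p_value(a, b, n_resamples=10000, seed=0):
    """One-sided Monte Carlo permutation test of whether mean(a) exceeds mean(b)."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(b)
    observed = sum(a) / n_a - sum(b) / n_b
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabelling of the pooled values
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            hits += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (hits + 1) / (n_resamples + 1)

# Made-up lifetimes in days (NOT the study's data): 10 treated, 20 untreated.
a = [1221, 1250, 1275, 1290, 1310, 1325, 1340, 1370, 1400, 1441]
b = [950, 962, 975, 990, 1001, 1010, 1022, 1030, 1041, 1050,
     1059, 1066, 1075, 1083, 1095, 1104, 1118, 1130, 1142, 1154]

p = permutation_p_value(a, b)
print(p)
```

With completely separated groups the resampled difference essentially never reaches the observed one, so the estimate is limited by the number of resamples rather than by the data.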

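Assuming no difference between the two groups, every way of labelling 10 of the 30 rats as Group A is equally likely, and only one labelling puts all 10 A lifetimes above all 20 B lifetimes. So the chance of the observed separation is 1 in ${30 \choose 10} = {30 \choose 20} = 30{,}045{,}015$. The one-sided P-value of the permutation test is therefore less than 1 chance in 30 million (a two-sided test, which also counts the case of all A below all B, doubles this). This indicates a very highly significant difference between the survival times of the two groups.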
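The count can be checked directly. The sketch below computes the binomial coefficient with `math.comb` and then verifies the "only one separated labelling" argument by brute force on a smaller, hypothetical pair of group sizes (3 and 4), since enumerating all 30 million labellings is unnecessary. The helper name `count_separated` is my own.

```python
import math
from itertools import combinations

n_a, n_b = 10, 20
arrangements = math.comb(n_a + n_b, n_a)   # ways to pick which 10 of 30 are Group A
p_one_sided = 1 / arrangements             # all A above all B
p_two_sided = 2 / arrangements             # complete separation in either direction
print(arrangements, p_one_sided, p_two_sided)

def count_separated(size_a, size_b):
    """Count labellings of ranks 0..N-1 in which every A rank exceeds every B rank."""
    total = 0
    separated = 0
    for a_ranks in combinations(range(size_a + size_b), size_a):
        total += 1
        if min(a_ranks) >= size_b:   # A occupies exactly the top size_a ranks
            separated += 1
    return separated, total

# Small sanity check with hypothetical group sizes 3 and 4.
print(count_separated(3, 4))
```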

For some additional details of permutation tests, you can look at Eudey et al. (2010), especially Section 4.


The effort to see if you can reproduce the findings of this study from the published information is laudable. Reproducibility of research results is an urgent topic of discussion in scientific journals. Increasingly, reputable journals are insisting that authors make data and computational analyses available as a condition of publication.