Understanding "Comparison" method in statistics


I am struggling a lot with statistics so I decided to try David Freedman's Statistics book.

In the book, first chapter, there is this explanation:

A controlled experiment to show the vaccine was effective. For this two million children were involved, and half a million were vaccinated. A million were deliberately left unvaccinated, as controls; half a million refused vaccination. In the Salk vaccine field trial, the treatment and control groups were of different sizes, but that did not matter. The investigators compared the rates at which children got polio in the two groups - cases per thousand. Looking at rates instead of absolute numbers adjusts for the difference in the sizes of groups.

I did not understand the part where the author talks about comparing rates instead of absolute numbers. How does using rates keep the difference in group sizes from affecting the comparison?

It would be really helpful if someone could give me an example of this.


There are 1 best solutions below


I think the author is writing somewhat informally. Working with percentages, as here, certainly mitigates the effect of sample size, but it does not eradicate it. To take an extreme case, if one of the samples had only one person in it, there'd be little a statistician could do.

Here, though, saying that, e.g., $5\%$ of one pool caught the disease vs. $0.1\%$ of the other would, for pools of any reasonable size, surely be statistically significant, whereas saying that, e.g., $50$ in one pool caught the disease vs. $63$ in the other is useless if you don't know the sample sizes. If, say, the first pool had $1000$ people in it and the second had $10000$, then that $50$ would look a lot larger than the $63$. Indeed, looking at percents, we'd have $5\%$ of the first pool vs. $0.63\%$ of the second.
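To make the arithmetic concrete, here is a minimal Python sketch that converts the counts from the example above into rates per thousand, the unit the book uses. The function name `rate_per_thousand` and the variable names are just illustrative choices, not anything from the book.

```python
def rate_per_thousand(cases, group_size):
    """Return the number of cases per 1,000 members of the group."""
    return 1000 * cases / group_size

# Counts from the example above: 50 cases among 1,000 people
# vs. 63 cases among 10,000 people.
pool_a = rate_per_thousand(50, 1_000)
pool_b = rate_per_thousand(63, 10_000)

print(f"pool A: {pool_a:.1f} cases per thousand")
print(f"pool B: {pool_b:.1f} cases per thousand")
```

The raw counts ($50$ vs. $63$) suggest the second pool did worse, but once each count is divided by its own pool size, the first pool's rate is clearly the higher one. That division is all "adjusting for the difference in the sizes of groups" means.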

In any real-life experiment, one would expect to have access to the exact makeup of the pools, and that should be critical in measuring the significance of any observed result.
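As a rough illustration of why the pool sizes still matter for significance, the sketch below applies a standard two-proportion z-test (my choice of technique, not something the book or the question uses) to two made-up scenarios with identical rates, $5\%$ vs. $3\%$, but different pool sizes. All the counts here are hypothetical, and the function name `two_proportion_z` is only illustrative.

```python
import math

def two_proportion_z(cases_a, n_a, cases_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Uses the pooled proportion for the standard error and the
    normal approximation, so it is only a rough guide for small counts.
    """
    p_a = cases_a / n_a
    p_b = cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value for a standard normal: P(|Z| > |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: the same rates (5% vs. 3%) in both scenarios.
z_small, p_small = two_proportion_z(5, 100, 3, 100)            # 100 per pool
z_large, p_large = two_proportion_z(500, 10_000, 300, 10_000)  # 10,000 per pool

print(f"small pools: z = {z_small:.2f}, p = {p_small:.3g}")
print(f"large pools: z = {z_large:.2f}, p = {p_large:.3g}")
```

With only 100 people per pool, the difference between $5\%$ and $3\%$ is well within what chance alone could produce; with 10,000 per pool, the same rates give overwhelming evidence of a real difference. So rates make groups of different sizes comparable, but how much to trust the comparison still depends on how large the groups are.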