I've run a test with one control and one experiment group, and am questioning myself on whether or not I've used the right test (or if significance can even be calculated on the following sample sizes).
The data is as follows:
The control cohort (A) had 63 people see the treatment, and 1 person performed the action (1.59%).
The experiment cohort (B) had 64 people see the treatment, and 9 people performed the action (14.1%).
I used a z-test for two population proportions (this calculator: http://www.socscistatistics.com/tests/ztest/) to compare the two proportions. It reports that the proportion of people who performed the action in B is a statistically significant increase over the proportion in A, with a p-value of 0.00453.
However I wanted to make sure that:
- a) I'm using the right test -- I know t-tests are sometimes better tests to use when sample sizes are small
- b) Statistical significance can even be determined on such a small sample
Any help or guidance would be great - thanks!
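The test you referenced is the standard z test for the difference of two binomial population proportions. It assumes that the z-statistic is approximately normal. Using the pooled standard error I got $|Z| \approx 2.61$, which is significant at the 1% level (the one-sided P-value is about 0.0045, matching the figure you report; the two-sided P-value is about 0.009). I do not see the applicability of the t distribution to this situation: t-tests concern means of roughly normal data with an estimated variance, not binomial counts.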
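For anyone who wants to reproduce the z statistic by hand, here is a minimal sketch in Python using only the standard library. The function name `two_proportion_z_test` is my own, and it assumes the pooled-proportion form of the standard error, which is the usual choice when testing for equal proportions.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test for two proportions.

    Returns (z, one-sided P, two-sided P), with z computed as (p2 - p1) / SE.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Under the null hypothesis both groups share one success probability,
    # so the standard error uses the pooled estimate.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Upper-tail standard normal probability of |z| via erfc.
    p_one_sided = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return z, p_one_sided, 2 * p_one_sided

# Counts from the question: A = 1 of 63, B = 9 of 64.
z, p_one_sided, p_two_sided = two_proportion_z_test(1, 63, 9, 64)
print(f"z = {z:.2f}, one-sided P = {p_one_sided:.4f}, two-sided P = {p_two_sided:.4f}")
```

Whether the one-sided or two-sided P-value is appropriate depends on whether the direction of the effect was specified before looking at the data.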
However, because the counts of 'successes' are so small, I wondered about the validity of the normality assumption. So I did a Fisher exact test in R, obtaining a P-value between 1% and 2%, so it certainly seems legitimate to reject the null hypothesis (that the two groups react similarly) at the 5% level.
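I ran the test in R, but the computation is simple enough to sketch in plain Python as well. The helper `fisher_exact_2x2` below is a hypothetical name of mine; it assumes the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one, given fixed margins.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns (lower-tail P for cell a, two-sided P).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k successes in row 1, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    probs = {k: prob(k) for k in range(k_min, k_max + 1)}
    p_obs = probs[a]
    p_lower = sum(p for k, p in probs.items() if k <= a)
    # Two-sided: all tables at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    p_two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return p_lower, p_two_sided

# Rows are cohorts A and B; columns are (performed action, did not).
p_lower, p_two_sided = fisher_exact_2x2(1, 62, 9, 55)
print(f"one-sided P = {p_lower:.4f}, two-sided P = {p_two_sided:.4f}")
```

The two-sided value falls between 1% and 2%, and the one-sided value is a little under 1%.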
Addendum: More specifically, one often sees the following rule of thumb for deciding whether the test statistic in a z-test is close enough to normal to give reliable P-values. The minimum of these four numbers should be at least 5 (less-fussy authors say 4): the two success counts and the two failure counts. In your case, $\min(1,9,62,55) = 1,$ which falls below either threshold, so you shouldn't rely on the z-test. However, in your case it isn't anywhere near a close call whether to reject at the 5% level, since the exact test also rejects.
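The rule of thumb is easy to check mechanically. A small sketch, where the function name `normal_approx_ok` and the default `threshold=5` are my own choices for illustration:

```python
def normal_approx_ok(x1, n1, x2, n2, threshold=5):
    """Return (smallest count, whether every count meets the threshold)."""
    # The four counts: successes and failures in each group.
    counts = (x1, x2, n1 - x1, n2 - x2)
    smallest = min(counts)
    return smallest, smallest >= threshold

smallest, ok = normal_approx_ok(1, 63, 9, 64)
print(smallest, ok)
```

With the counts from the question the smallest cell is the single success in cohort A, so the check fails for any threshold above 1.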