I am working on a marketing campaign analysis comparing treatment A and treatment B. The goal is to find which treatment yields more clicks. Say the experiment runs for one month, and at the end I want to determine whether one treatment is statistically better than the other. I can formulate the problem in two ways:
t-test: I find the average daily click rates of A and B, and then perform a t-test.
chi2 test: I find the total clicks of these A/B treatments at the end, and perform a chi2 test.
Can someone comment on the differences between these two approaches?
Would you say that A is the same as B? Probably not, because more people visited the site under treatment B and it had a lower click rate. At this point, you can do a 2-proportion test or a chi-square test to check whether treatment A and treatment B differ in how well they get people to click. They are the same test. Here is how you would do it using a 2-proportion test.
Hand-calculating the z test statistic
So we get a p-value of 84%, meaning there is no evidence of a difference between the two proportions. (This is not too surprising, because I constructed the proportions to be quite close together.) The square of the z-statistic that we just computed is actually the chi-squared statistic, as we will see later on.
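For reference, the hand calculation can be sketched in Python using only the standard library. The counts below are hypothetical and chosen just for illustration; they are not the figures behind the 84% p-value quoted above, and the function name `two_proportion_z_test` is my own.

```python
import math

def two_proportion_z_test(clicks_a, visits_a, clicks_b, visits_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = clicks_a / visits_a
    p_b = clicks_b / visits_b
    # Pooled click rate under the null hypothesis of no difference
    p_pool = (clicks_a + clicks_b) / (visits_a + visits_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / visits_a + 1 / visits_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only
z, p_value = two_proportion_z_test(clicks_a=50, visits_a=1000,
                                   clicks_b=58, visits_b=1200)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

Note that the two treatments do not need the same number of visits: the pooled standard error accounts for the different sample sizes.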
Using the prop.test command
We can also directly use the built-in prop.test command.
From the third line of the output, you can see that the chi-squared test statistic (which is the square of the z-test statistic) and the p-value both match the values computed by hand above.
Chi-squared by hand
Now here is how to do the chi-square test by hand. Note that it is exactly the same test as the one we have just done. It compares the observed data to what we would have expected if there were no difference between the treatments. That is, it compares the first table with the following table, called the expected table.
The code is as follows
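A minimal Python sketch of that comparison, again with a hypothetical 2x2 table rather than the original data. It builds the expected table from the row and column totals and sums (observed - expected)^2 / expected over the four cells, without a continuity correction.

```python
import math

# Hypothetical 2x2 table: rows = treatments A and B, columns = (clicks, no clicks)
observed = [[50, 950],
            [58, 1142]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell if the click rate does not depend on treatment
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic (no continuity correction)
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_sq / 2))
print(f"chi-squared = {chi_sq:.4f}, p-value = {p_value:.3f}")
```

For a 2x2 table this statistic equals the square of the pooled two-proportion z-statistic computed on the same counts, which is why the two tests give the same p-value.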
Chi square test command
Or, using the built-in command, we get
Conclusion
The two methods you propose are fundamentally different. One tests whether there is an overall difference in the proportion of clicks garnered by treatment A versus treatment B, and it can be conducted even if the number of site visits differs between treatment A and treatment B. The other method requires that the number of site visits under treatment A and treatment B be the same, and then compares the clicks per day of the two treatments.
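To make the contrast concrete, here is a rough sketch of the per-day approach in Python. The daily click rates are made up, and I am assuming Welch's unequal-variance form of the t-test, which is one reasonable choice and not something specified in the question; `welch_t` is a hypothetical helper name.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    se_sq_a, se_sq_b = var_a / n_a, var_b / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical daily click rates (clicks / visits per day), one week per treatment
daily_rate_a = [0.051, 0.047, 0.055, 0.049, 0.052, 0.046, 0.050]
daily_rate_b = [0.048, 0.045, 0.050, 0.047, 0.051, 0.044, 0.049]

t, df = welch_t(daily_rate_a, daily_rate_b)
print(f"t = {t:.3f}, df = {df:.1f}")
```

Here the unit of analysis is the day, so every day carries equal weight regardless of how many visits it had, whereas the proportion test treats every visit as one observation. Turning `t` into a p-value needs the t-distribution CDF, which the standard library does not provide (a package such as `scipy.stats` does).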