Experiment scenario
I have a scenario where a test suite (a set of test cases) can either detect a bug in a program or not. I run this test suite against 70 different versions of the program, each containing a different bug, and check whether the bug was detected.
A test case is, for example, in a calculator program: "Sum 10 + 10; the output should be 20. If it is not, the bug was detected."
So I run this test suite 50 times against the same program, which had a different bug each time, and the test suite detects 32 bugs.
So 32/50 = 0.64: the proportion of bugs detected is 64%.
Doubt
For example, I have 4 test cases: the first test case has a detection proportion of 20/70, another 16/70, another 40/70, and another 54/70. The mean of these proportions is 0.46. In hypothesis testing, should we use this value (0.46), or can we use the global value of the test suite, 0.64 (which is not a mean)? I have this doubt because I have already seen some examples where people use the mean and others where they apparently do not.
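To make the distinction concrete, here is a small sketch using the counts from the question (20, 16, 40, 54 detections out of 70 versions are taken from the example above; everything else is hypothetical). It shows that the mean of per-test-case proportions is a different quantity from the suite-level proportion, which needs the full detection matrix (which bug each test case caught), not just the per-case counts:

```python
# Hypothetical per-test-case detection counts from the question:
# test case i detected counts[i] of the 70 buggy versions.
counts = [20, 16, 40, 54]
n_versions = 70

# Mean of the per-test-case proportions (the 0.46 in the question).
per_case = [c / n_versions for c in counts]
mean_of_proportions = sum(per_case) / len(per_case)
print(round(mean_of_proportions, 2))  # 0.46

# The suite-level proportion counts a bug as detected if ANY test
# case catches it. It cannot be computed from `counts` alone, so
# here is a hypothetical detection matrix for 5 of the versions
# (rows = test cases, columns = buggy versions).
matrix = [
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0],
]
detected_by_suite = [any(row[j] for row in matrix) for j in range(5)]
suite_proportion = sum(detected_by_suite) / len(detected_by_suite)
print(suite_proportion)  # 0.8 -- larger than any single test case's rate
```

The two numbers answer different questions: the mean of proportions describes the average effectiveness of an individual test case, while the suite-level proportion describes the effectiveness of the suite as a whole, so the right choice depends on which of those your hypothesis is about.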