Testing $ H_0: p_1 = \ldots = p_k $ with a sequence $ x_1, \ldots, x_k $ of realizations of $ X_1 \sim B(n_1, p_1), \ldots, X_k \sim B(n_k, p_k) $


Let $ X_1 \sim B(n_1, p_1), \ldots, X_k \sim B(n_k, p_k) $ be independent random variables and $ x_1, \ldots, x_k $ a sequence of realizations of them. I know the value of $n_1, \ldots, n_k $ but not that of $ p_1, \ldots, p_k $ and I'm looking for a way to test $ H_0: p_1 = \ldots = p_k $.


BEST ANSWER

Note: This is a revision. In the former version, I tried to generalize the usual test for comparing success probabilities that uses data from two binomial random variables. A correct version of that is more complicated than I first assumed.

After a discussion in the Comments beneath @kimchilover's Answer (+1), I realized that my method was not the same as the suggested test on a $k \times 2$ table, but that the method in that Answer is correct. So I have changed my answer to discuss that method, which is implemented in R and other statistical software.


If $H_0: p_1 = p_2 = \cdots = p_k$ is true, then you have observed $\sum_{i=1}^k X_{i1}$ successes in $\sum_{i=1}^k n_i$ trials altogether, so the pooled estimate of the common success probability is $\hat p = \sum_{i=1}^k X_{i1}\big/\sum_{i=1}^k n_i.$ Also, under $H_0$ you have expected success counts $E_{i1} = n_i\hat p,$ for $i = 1, \dots, k,$ to compare with the corresponding $k$ observed numbers $X_{i1}$ of successes. Similarly, you have observed failure counts $X_{i2} = n_i - X_{i1}$ and corresponding expected failure counts $E_{i2} = n_i - E_{i1}.$

If $H_0$ is true, then the goodness-of-fit statistic $$Q = \sum_{j=1}^2\sum_{i=1}^k \frac{(X_{ij}-E_{ij})^2}{E_{ij}} \stackrel{\text{aprx}}{\sim} \mathsf{Chisq}(k-1),$$

provided that all $E_{ij} > 5$ (some texts say provided that "most" $E_{ij} > 5$ and all $E_{ij} > 3$).

Notice that small values of $Q$ correspond to 'good fits' of observed counts to corresponding expected counts. You will reject $H_0$ at the 5% level of significance if $Q \ge q^*,$ where $q^*$ cuts 5% of the probability from the upper tail of $\mathsf{Chisq}(k-1).$
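If you prefer to compute the critical value rather than look it up in a table, it is a one-liner; here is a minimal sketch in Python, assuming SciPy is available (R users can call qchisq(0.95, k - 1) instead):

```python
from scipy.stats import chi2

k = 5                              # number of binomial samples
q_star = chi2.ppf(0.95, df=k - 1)  # 95th percentile of Chisq(k-1)
print(round(q_star, 3))            # 9.488
```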

Example: Suppose $k = 5;\; n_1 = n_2 = 20;\; n_3 = n_4 = n_5 = 30.$ Also suppose $\mathbf{X_1} = (5,3,5,8,8),$ so that $\hat p= 29/130 = 0.2231$ and $E_{11} = E_{21} = 20\hat p = 4.462,$ $E_{31} = E_{41} = E_{51} = 30\hat p = 6.692.$ Quantities for the second column are found accordingly.

The computation of Q in R statistical software is shown below:

X1 = c(5,3,5,8,8); X2 = c(15,17,25,22,22)  # observed successes and failures
MAT = cbind(X1,X2)                         # k x 2 table of counts
chisq.test(MAT)

        Pearson's Chi-squared test

data:  MAT
X-squared = 1.9085, df = 4, p-value = 0.7526

Warning message:
In chisq.test(MAT) : Chi-squared approximation may be incorrect

Thus $Q = 1.9085,$ and we cannot reject $H_0$ because the 95th percentile of $\mathsf{Chisq}(4)$ is $q^* = 9.488.$ Equivalently, one would reject for a P-value below 0.05, and the P-value 0.7526 is nowhere close to that. The warning message is generated because $E_{11} = E_{21} \approx 4.46,$ which is slightly below $5.$ [The code chisq.test(MAT)$exp prints the matrix of $E_{ij}$'s. The code chisq.test(MAT, sim=T) simulates the more reliable P-value $\approx 0.77.$]
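The same computation can be cross-checked outside R. A sketch in Python using SciPy's chi2_contingency (SciPy's function, not part of the original answer) reproduces the statistic, degrees of freedom, P-value, and the matrix of expected counts:

```python
import numpy as np
from scipy.stats import chi2_contingency

X1 = [5, 3, 5, 8, 8]        # observed successes
X2 = [15, 17, 25, 22, 22]   # observed failures
MAT = np.column_stack([X1, X2])

Q, p_value, df, expected = chi2_contingency(MAT)
print(round(Q, 4), df, round(p_value, 4))  # 1.9085 4 0.7526
print(np.round(expected[:, 0], 3))         # expected successes E_i1 = n_i * phat
```

The first two rows of the expected matrix show the $E_{11} = E_{21} \approx 4.46$ that triggered R's warning.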

[For this example, I simulated the five $X_i$'s using all five $p_i = 1/4,$ so that $H_0$ is true. For real data, of course, one never knows for sure whether $H_0$ is true or not.]


Consider an experiment with $k = 5$ independent random variables $X_i,$ each $\mathsf{Binom}(50, .7),$ so that $H_0$ is true. A histogram of 100,000 values of $Q$ from repeated performances of this experiment is shown below, along with the density function of $\mathsf{Chisq}(4).$ The $Q$-value led to rejection in about 5% of the iterations.

[Figure: histogram of 100,000 simulated $Q$-values under $H_0$ with the $\mathsf{Chisq}(4)$ density superimposed.]

By contrast consider an experiment with $k = 5$ independent binomial random variables $X_i$ each with $n = 50,$ but with $p = .3, .4, .5, .6$ and $.7,$ so that $H_0$ is not true. Here about 97% of 100,000 iterations led to rejection.

[Figure: histogram of 100,000 simulated $Q$-values under the alternative, with the $\mathsf{Chisq}(4)$ density superimposed.]
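Both simulation experiments are easy to reproduce. Here is a sketch in Python (the 10,000 iterations and the seed are my choices, not the answer's 100,000 runs, but the rejection rates come out close to the 5% and 97% reported above):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2024)   # seed fixed for reproducibility
n = np.full(5, 50)                  # five binomials, each with n = 50
q_star = chi2.ppf(0.95, df=4)       # rejection threshold, about 9.488

def reject_rate(p, iters=10_000):
    """Fraction of simulated Q statistics that exceed q_star."""
    x = rng.binomial(n, p, size=(iters, 5))        # success counts, one row per run
    phat = x.sum(axis=1, keepdims=True) / n.sum()  # pooled estimate under H0
    e1 = n * phat                                  # expected successes
    e2 = n - e1                                    # expected failures
    q = ((x - e1)**2 / e1 + (n - x - e2)**2 / e2).sum(axis=1)
    return (q > q_star).mean()

rate_null = reject_rate(np.full(5, 0.7))                # H0 true
rate_alt = reject_rate(np.array([.3, .4, .5, .6, .7]))  # H0 false
print(rate_null, rate_alt)  # near 0.05 and near 0.97, respectively
```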

ANSWER

You can use a chi-squared test for homogeneity of a $k\times 2$ contingency table.
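As a minimal illustration, assuming SciPy (the counts below are made up for the example): put the $k$ success counts and failure counts side by side as a $k \times 2$ table and run the test, which has $k - 1$ degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2_contingency

successes = np.array([12, 8, 15])   # hypothetical x_i
n = np.array([40, 40, 40])          # known n_i
table = np.column_stack([successes, n - successes])  # k x 2 table

stat, p_value, df, _ = chi2_contingency(table)
print(df)       # k - 1 = 2
print(p_value)  # a large P-value: no evidence against H0 here
```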