Confidence interval for categorical data


I have the following data:

       ND  NI  SC  PI  PA
Green  27   3   9   4   7
Blue   24  14  14   6   8

I want to do the following:

  1. Write the multinomial model for the data and the hypothesis for the two possible categorizations.

  2. Do an appropriate test for independence between the two possible categorizations.

  3. Make a 95% confidence interval for the ratio of colors in the PA-category.

My approach:

1.

I create the following models:

$$ M_0:\quad \{X_{ij}\} \sim \mathrm{Multinom}(116,\{\pi_{ij}\}) \\ \pi_{ij} \ge 0, \sum_{ij}\pi_{ij}=1 $$

$$ M_1:\quad \{X_{ij}\} \sim \mathrm{Multinom}(116,\{\pi_{ij}\}) \\ \pi_{ij}=\alpha_i\beta_j \\ \alpha_i \ge 0, \sum_{i}\alpha_i=1, \quad \beta_j \ge 0, \sum_j \beta_j=1 $$

I formulate the hypothesis as the following:

$$ H_0: \quad \pi_{ij}=\alpha_i\beta_j \text{ for all } i=1,2 \text{ and } j=1,2,\dots ,5 $$
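To make the two models concrete, here is a minimal sketch of their fitted probabilities in R. It assumes the usual maximum-likelihood estimates (observed proportions under $M_0$, products of the margins under $M_1$); the variable names are my own and purely illustrative.

```r
# Observed counts from the question (rows: colour, columns: category)
mat <- rbind(Green = c(27, 3, 9, 4, 7),
             Blue  = c(24, 14, 14, 6, 8))
colnames(mat) <- c("ND", "NI", "SC", "PI", "PA")
n <- sum(mat)                            # total count, 116

# M_0: unrestricted multinomial, MLE is the observed proportion per cell
pi_hat_M0 <- mat / n

# M_1: independence, MLE is the product of the row and column margins
alpha_hat <- rowSums(mat) / n            # estimates of alpha_i
beta_hat  <- colSums(mat) / n            # estimates of beta_j
pi_hat_M1 <- outer(alpha_hat, beta_hat)  # alpha_i * beta_j

# Expected counts under independence
expected <- n * pi_hat_M1
round(expected, 2)
```

The `expected` matrix is what the "expected value below 5" rule of thumb is checked against.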

2.

I run this test in R. Since one of the expected counts is below 5, I use Fisher's exact test instead of the G test:

mat <- rbind(c(27, 3, 9, 4, 7), c(24, 14, 14, 6, 8))
fisher.test(mat)

Output:
p-value = 0.1377

So, at the 5% level, I cannot reject the hypothesis that the two categorizations are independent.
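For comparison, the G test (likelihood-ratio test) that Fisher's test replaces can be computed by hand. This is only a sketch, assuming the standard statistic $G = 2\sum_{ij} O_{ij}\log(O_{ij}/E_{ij})$ with $(2-1)(5-1)=4$ degrees of freedom; with an expected count below 5, its chi-squared approximation may be rough.

```r
# Observed counts from the question
mat <- rbind(Green = c(27, 3, 9, 4, 7),
             Blue  = c(24, 14, 14, 6, 8))

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(mat), colSums(mat)) / sum(mat)

# Likelihood-ratio statistic (all observed counts are > 0, so log is safe)
G  <- 2 * sum(mat * log(mat / expected))
df <- (nrow(mat) - 1) * (ncol(mat) - 1)

# Upper-tail p-value from the chi-squared approximation
p_G <- pchisq(G, df = df, lower.tail = FALSE)
c(G = G, df = df, p = p_G)
```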

3.

My hunch is that I should compute a confidence interval for a binomial proportion, but I'm stuck. I would like to calculate this in R too and would appreciate help.
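One possible way to follow the binomial hunch, sketched under assumptions: I read "ratio of colors" as the proportion of Green among the 15 PA observations (7 Green, 8 Blue), and I condition on that PA total so that the Green count is binomial. `binom.test` then gives the exact (Clopper-Pearson) interval. If the Green-to-Blue ratio $p/(1-p)$ is meant instead, the endpoints can be transformed, since that map is increasing.

```r
# PA column from the question: 7 Green and 8 Blue
x_green <- 7
n_pa    <- 7 + 8

# Point estimate of the proportion of Green within PA
p_hat <- x_green / n_pa

# Exact (Clopper-Pearson) 95% confidence interval for that proportion
ci <- as.numeric(binom.test(x_green, n_pa, conf.level = 0.95)$conf.int)
ci

# Assumption: if "ratio" means Green:Blue, transform the endpoints to p / (1 - p)
ci_ratio <- ci / (1 - ci)
ci_ratio
```

Other intervals (for example the Wilson interval from `prop.test`) are equally defensible; the exact interval is just a conservative choice for a sample of 15.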