Here is the exact question:
A biologist wanted to know if the Cowpea Weevil has a preference for one type of bean over others as a place to lay eggs. She put equal amounts of four types of seeds into a large container and randomized by mixing and then added adult Cowpea Weevils. After a few days, she observed the following data. Do these data provide evidence of a preference for some types of beans over others?
And here is the data:
Type of Bean : Number of Eggs
Pinto : 167
Cowpea : 176
Navy : 174
Northern : 194
I used a $X^2$ test with the expected values being 177.75. Doing this, I get $X^2 = 2.9761$. However, the statistic should be $X^2 = 2.23$. I haven't done statistical tests for a while, so I'm not sure if I'm using the wrong test or if I'm just performing it wrong.
Your null hypothesis is that all four types of beans are equally preferred by the weevil, and the alternative is that not all are equally preferred.
Your observed counts in the four levels of your categorical variable are $X = (167, 176, 174, 194),$ for a total of $711$ eggs.
Under the null hypothesis, the expected number of eggs on each type of bean is $E = 711/4 = 177.75.$
The chi-squared goodness-of-fit statistic is $$Q = \sum_{i=1}^4 \frac{(X_i - E)^2}{E} \stackrel{\text{aprx}}{\sim} \mathsf{Chisq}(df=3).$$
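As a check on the arithmetic, here is a minimal sketch in plain Python that evaluates the statistic term by term. The variable names (`observed`, `expected`, `contributions`, `q`) are my own choices for illustration; the counts are the ones from the question.

```python
# Observed egg counts from the question.
observed = {"Pinto": 167, "Cowpea": 176, "Navy": 174, "Northern": 194}

total = sum(observed.values())       # 711 eggs in all
expected = total / len(observed)     # 177.75 per bean type under equal preference

# Each bean type contributes (X_i - E)^2 / E to the statistic.
contributions = {
    bean: (count - expected) ** 2 / expected
    for bean, count in observed.items()
}
q = sum(contributions.values())

for bean, c in contributions.items():
    print(f"{bean:9s} {c:.4f}")
print(f"Q = {q:.3f}")   # Q = 2.232
```

Printing the four contributions separately makes it easy to compare against a hand calculation and to spot which term went wrong if a different total comes out.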
The critical value for a test at level 5% is $Q^* = 7.815$ and the computed value of the test statistic is $Q = 2.232 < Q^*,$ so you cannot reject the null hypothesis. Remember that large observed values of $Q$ indicate bad fit to the equally-likely model. [I can't figure out how you got 2.9761.]
The P-value of your test is 0.526. Thus, if all types of beans are equally preferred, a value of $Q$ at least as large as 2.232 would occur more than half the time. [At the 5% level, one rejects the null hypothesis only if the P-value is smaller than 0.05.]
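The P-value and the critical value can be reproduced without a statistics package, because for 3 degrees of freedom the upper-tail probability has the closed form $P(Q \ge x) = \operatorname{erfc}\!\big(\sqrt{x/2}\big) + \sqrt{2x/\pi}\, e^{-x/2}.$ The sketch below uses only Python's `math` module; the helper names `chisq3_sf` and `chisq3_upper_quantile` are hypothetical, and the bisection bounds are an assumption that is adequate for this example.

```python
import math

def chisq3_sf(x):
    """Upper-tail probability P(Q >= x) for Chisq(df=3), valid for x >= 0."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chisq3_upper_quantile(alpha, lo=0.0, hi=100.0):
    """Value x with P(Q >= x) = alpha, found by bisection (the tail is decreasing)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if chisq3_sf(mid) > alpha:
            lo = mid   # tail still too heavy: move right
        else:
            hi = mid   # tail too light: move left
    return (lo + hi) / 2

q = 2.232                               # computed test statistic
p_value = chisq3_sf(q)                  # area to the right of q
critical = chisq3_upper_quantile(0.05)  # cuts off 5% in the upper tail

print(f"P-value        = {p_value:.3f}")
print(f"critical value = {critical:.3f}")
print("reject H0" if q > critical else "do not reject H0")
```

If SciPy is available, `scipy.stats.chisquare([167, 176, 174, 194])` should give the same statistic and P-value in one call, since equal expected counts are its default.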
Below is a sketch of the density function of $\mathsf{Chisq}(3).$ The vertical broken red line is at the computed value of $Q,$ to the right of which about 0.526 of the area under the curve lies. The vertical black line, beyond which 0.05 of the area lies, is the critical value.
Notes: (1) This question illustrates the importance of actually doing the goodness-of-fit test, rather than trying to make judgments by looking at descriptive graphs of categorical data. In a bar chart of the four counts, the last bar (for Northern) might look "impressively" taller than the other three. (2) The population mean of the distribution $\mathsf{Chisq}(3)$ is $\mu = 3,$ and its median is $\eta = 2.3660.$ So the observed $Q = 2.232$ is smaller than both the mean and the median, which is certainly not an extreme outcome.