To better understand the link between hypothesis testing and likelihood, I recently tried to compare the binomial probability with Pearson's chi-square test for a coin toss (or, equivalently, a random bit generator).
If we toss a coin 100 times, the probability of getting 50 tails and 50 heads is about 8% under the binomial model. In other words, the likelihood of the binomial model with p=0.5 (my null hypothesis) given the 50/50 outcome is 0.08. This does not mean much in itself, but comparing it with other models (p=0.1, p=0.2) shows that it is actually a pretty good likelihood.
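For reference, the likelihood comparison above can be reproduced with a short sketch like the following. It assumes SciPy is installed; the variable names are only illustrative.

```python
from scipy.stats import binom

n, k = 100, 50  # 100 tosses, 50 heads observed

# Likelihood of each candidate value of p given the 50/50 outcome.
likelihoods = {p: binom.pmf(k, n, p) for p in (0.1, 0.2, 0.5)}

for p, lik in likelihoods.items():
    print(f"p = {p}: likelihood = {lik:.3e}")
```

The value for p=0.5 is the 0.08 quoted above; the other two are many orders of magnitude smaller.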
The problem arises when I try to analyse the result with Pearson's chi-square test. My null hypothesis is the binomial model with p=0.5. First, I don't understand why I can use the chi-square distribution here and what makes it valid in this case, since it is supposed to be used for normally distributed data (what is supposed to be normally distributed here?).
Also, and this is a much bigger issue for me, when I compute the chi-square value with one degree of freedom in this case (sist.chi2.pdf(Q,1)), I obtain inf when the result is 50/50, i.e. when Q is 0.
Why am I seeing this result? Am I right in saying that the chi-square test is completely meaningless with 1 degree of freedom?
Thank you in advance for your help.
If $p$ is the parameter for the true proportion of heads, then for a sample size of $n$, the number of heads observed is a binomial random variable $$X \sim \operatorname{Binomial}(n = 100, p = 0.5).$$ Then $$\Pr[X = 50] = \binom{100}{50} (0.5)^{50} (1 - 0.5)^{100 - 50} \approx 0.0795892.$$ Using the normal approximation to the binomial, with continuity correction, this probability may be computed by modeling $X$ as approximately normal with mean $\mu = np = 50$ and variance $\sigma^2 = np(1-p) = 25$. Consequently, the random variable $$Z = \frac{X - \mu}{\sigma} = \frac{X - 50}{5}$$ is approximately standard normal. Therefore $$\begin{align*}\Pr[X = 50] &= \Pr[49.5 \le X \le 50.5] \\ &= \Pr\left[ \frac{49.5 - 50}{5} \le \frac{X - 50}{5} \le \frac{50.5 - 50}{5} \right] \\ &\approx \Pr[-0.1 \le Z \le 0.1] \\ &= \Pr[|Z| \le 0.1]. \end{align*}$$ Moreover, the square of a standard normal random variable is chi-squared with one degree of freedom, $$Z^2 \sim \operatorname{ChiSquare}(\nu = 1),$$ so the above may be equivalently expressed as $$\Pr[Z^2 \le 0.01] \approx 0.0796557.$$ This is reasonably close to the exact value shown above. Continuity correction is required because the binomial probability is computed for a single elementary outcome, which without correction would correspond to a normal (or chi-squared) probability of zero. Note that $\Pr[Z = 0] \ne f_Z(0)$; that is, probability and density are not equivalent. The chi-squared density with one degree of freedom is unbounded at zero, which is why evaluating the density at $Q = 0$ returns inf, whereas the probability of any interval remains finite.
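As a rough numerical check of the figures above, here is a minimal sketch, assuming SciPy is available. The variable names are illustrative and not part of the original computation. It evaluates the exact binomial probability, the same probability through the normal and chi-squared distribution functions, and the chi-squared density at zero for contrast.

```python
from scipy.stats import binom, chi2, norm

n, p = 100, 0.5
mu = n * p                        # mean of X, 50
sigma = (n * p * (1 - p)) ** 0.5  # standard deviation of X, 5

# Exact binomial probability Pr[X = 50].
exact = binom.pmf(50, n, p)

# Continuity-corrected normal approximation: Pr[|Z| <= 0.1].
z_hi = (50.5 - mu) / sigma
via_normal = norm.cdf(z_hi) - norm.cdf(-z_hi)

# Same probability through the chi-squared CDF: Pr[Z^2 <= 0.01].
via_chi2 = chi2.cdf(z_hi ** 2, df=1)

# The density (not a probability) of ChiSquare(1) evaluated at 0.
density_at_zero = chi2.pdf(0.0, df=1)

print(exact, via_normal, via_chi2, density_at_zero)
```

The first three numbers agree to about three decimal places, while the last one is the density at a single point and carries no meaning as a probability.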
Note also that hypothesis testing is not an appropriate context in which to structure the computation. If the null hypothesis is $$H_0 : p = 0.5$$ then your alternative hypothesis is $$H_a : p \ne 0.5$$ for a two-sided test. If you observe exactly as many heads as tails, the chi-squared statistic is zero, which corresponds to a $p$-value of $1$ and a failure to reject $H_0$ in favor of $H_a$. However, you cannot use this test to make an inference about the truth of the null hypothesis, because the test statistic is computed under the assumption that the null hypothesis is true.
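To illustrate how the test itself behaves, the sketch below uses `scipy.stats.chisquare`. The 60/40 outcome is a hypothetical example added only for contrast and does not come from the question. The $p$-value is an upper tail probability of the chi-squared distribution, not a density value.

```python
from scipy.stats import chi2, chisquare

expected = [50, 50]  # expected counts under H0: p = 0.5, n = 100

# Observed 50 heads and 50 tails: the statistic is 0 and the p-value is 1.
statistic, p_value = chisquare([50, 50], f_exp=expected)

# Hypothetical 60/40 outcome, for contrast only.
observed = [60, 40]
q_biased = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_biased = chi2.sf(q_biased, df=1)  # upper tail probability Pr[Q >= q]

print(statistic, p_value)
print(q_biased, p_biased)
```

In both cases the $p$-value comes from `chi2.sf` (one minus the CDF), so a statistic of zero gives a tail probability of one and no evidence against $H_0$.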