I was working through an example from the book Epidemiologic Research by Kleinbaum (Example 15.6) and didn't understand some basic statistical aspects.
$$\text{Table 1: Cumulative-type data from the Evans County Heart Disease Study}$$
$$ \begin{array}{|c|cc|} \hline & \text{HI CAT} & \text{LO CAT} \\ \hline \text{CHD} & 27 & 44 \\ \text{No CHD} & 95 & 443 \\ \hline \end{array} $$
We have to check whether there is an association between CAT and CHD.
Here,
$a=27$, $b=44$, $c=95$, $d=443$, $n_1=122$, $n_0=487$, $m_1=71$, $m_0=538$, $n=609$.
So the cumulative incidence ratio is $$\hat {CIR}=\frac{\hat {CI_1}}{\hat {CI_0}}=\frac{a/n_1}{b/n_0}=\frac{27/122}{44/487}=2.45$$
The exact P-value for a test of $H_0:CIR=1$ versus $H_A:CIR>1$, based on the hypergeometric distribution, is given by the expression
$$Pr(A \ge 27|H_0)=\sum_{j=27}^{71}\frac{\binom{122}{j}\times \binom{487}{71-j}}{\binom{609}{71}}\ldots ❶$$
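For reference, the sum above can be evaluated directly. This is only a minimal sketch using the Python standard library, not the book's computation; the names `hypergeom_pmf` and `exact_p` are made up for illustration.

```python
from fractions import Fraction
from math import comb

# Margins taken from the 2x2 table in the question
n1, n0, m1 = 122, 487, 71   # HI CAT total, LO CAT total, CHD total
n = n1 + n0                 # 609 subjects overall
a = 27                      # observed CHD cases in the HI CAT group

def hypergeom_pmf(j):
    """P(A = j | H0) with all margins fixed, kept as an exact fraction."""
    return Fraction(comb(n1, j) * comb(n0, m1 - j), comb(n, m1))

# Upper tail P(A >= 27 | H0); j runs up to min(n1, m1) = 71
exact_p = sum(hypergeom_pmf(j) for j in range(a, min(n1, m1) + 1))
print(float(exact_p))
```

Exact fractions are used so that the huge binomial coefficients do not cause floating-point overflow; the conversion to `float` happens only at the end.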
Since this computation needs a computer, an approximation to this exact P-value can be obtained from the Mantel-Haenszel $\chi_1^2$ statistic
$$\chi_1^2=Z^2=\frac{(n-1)(ad-bc)^2}{n_1n_0m_1m_0}=16.22$$
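As a quick arithmetic check of the statistic above, here is a small sketch that plugs the cell counts into the formula. The variable names `chi2_mh` and `z_mh` are just illustrative.

```python
from math import sqrt

# Cell counts from the table
a, b, c, d = 27, 44, 95, 443
n1, n0 = a + c, b + d       # column totals: HI CAT, LO CAT
m1, m0 = a + b, c + d       # row totals: CHD, no CHD
n = n1 + n0

# Mantel-Haenszel chi-square with 1 degree of freedom
chi2_mh = (n - 1) * (a * d - b * c) ** 2 / (n1 * n0 * m1 * m0)
z_mh = sqrt(chi2_mh)        # corresponding Z value

print(round(chi2_mh, 2), round(z_mh, 2))  # 16.22 4.03
```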
It follows that $$pr(A\ge a|H_0)\approx (1/2)pr(\chi_1^2\ge 16.22|H_0)$$
(1) I don't understand why there is a factor of $(1/2)$ on the right-hand side.
$$pr(A\ge a|H_0)\approx (1/2)pr(\chi_1^2\ge 16.22|H_0)\approx pr(Z>4.03|H_0)$$
To calculate $pr(Z>4.03|H_0)$, they used
$$Z=\frac{(\hat {CI_1}-\hat {CI_0})-0}{\sqrt{\hat {CI}(1-\hat {CI})[(1/n_1)+(1/n_0)]}}$$
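The pooled-proportion $Z$ above can be checked numerically. This is a sketch under the assumption that $\hat{CI}$ is the overall incidence $m_1/n$, which is how I read the formula.

```python
from math import sqrt

a, b = 27, 44               # CHD cases in HI CAT and LO CAT
n1, n0 = 122, 487           # group sizes
n = n1 + n0

ci1 = a / n1                # cumulative incidence, HI CAT
ci0 = b / n0                # cumulative incidence, LO CAT
ci = (a + b) / n            # pooled incidence (assumed meaning of CI-hat)

z = (ci1 - ci0) / sqrt(ci * (1 - ci) * (1 / n1 + 1 / n0))
print(round(z, 2))          # 4.03
```

Note that $Z^2$ from this formula equals the Mantel-Haenszel statistic times $n/(n-1)$, so the two agree to two decimals here.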
(2) I don't understand why $(1/2)pr(\chi_1^2\ge 16.22|H_0)\approx pr(Z>4.03|H_0)$. Where did the $(1/2)$ go on the right-hand side? And why $pr(Z>\cdot)$ instead of $pr(Z\ge\cdot)$?
$$pr(A\ge a|H_0)\approx (1/2)pr(\chi_1^2\ge 16.22|H_0)\approx pr(Z>4.03|H_0)<0.0003$$
Thus, the crude analysis provides strong evidence of a CAT-CHD relationship solely on the basis of the data in Table 1.
(3) Why have they compared $pr(Z>4.03|H_0)$ with $0.0003$? And why does $pr(Z>4.03|H_0)<0.0003$ provide strong evidence of a CAT-CHD relationship?
As far as I can calculate, $pr(Z>4.03|H_0)=0.00002788$.
$P(Z\ge 4.03)+P(Z\le -4.03)=P(|Z|\ge 4.03)=P(Z^2\ge 4.03^2)=P(\chi ^2_1\ge 4.03^2)$
But $P(Z\ge 4.03)=P(Z\le -4.03)$ so $P(\chi ^2_1\ge 4.03^2)=2P(Z\ge 4.03).$ There's the mysterious $1/2.$
Also $P(Z>4.03)=P(Z\ge 4.03)$ because $Z$ is a continuous rv and so $P(Z=4.03)=0.$
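If it helps, the identity $P(\chi^2_1\ge z^2)=2P(Z\ge z)$ can be checked numerically. This is only a sketch using `math.erfc` from the standard library; the helper names `normal_upper_tail` and `chi2_1_upper_tail` are made up here.

```python
from math import erfc, sqrt

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

def chi2_1_upper_tail(x):
    """P(chi-square with 1 df >= x), which equals P(|Z| >= sqrt(x))."""
    return erfc(sqrt(x / 2))

z = 4.03
one_sided = normal_upper_tail(z)         # roughly 2.8e-05
two_sided = chi2_1_upper_tail(z ** 2)    # twice the one-sided tail

print(one_sided, two_sided / 2)
```

The one-sided value is well below $0.0003$, which is consistent with the inequality quoted from the book.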
And $P(Z\ge 4.03)\lt 0.0003$ does not mean they are comparing it to $0.0003$; I suspect they are only bounding the normal probability from a table. Maybe their Z table only goes up to $3.50$.
And of course, based on hypothesis testing logic, if the calculated p-value is small, like less than $\alpha=0.05$, then we reject the null hypothesis. Are you clear what the null and alternative are in this case?
And the question about x* is not clear. If the calculated test statistic turns out to be 0 in this case, you get perfect agreement between the data and the null hypothesis. So the conclusion is obvious. You need to review basic hypothesis testing.