Hypothesis testing using a conditional distribution


Let $X_1 \sim \operatorname{Geo}(p_1)$ and $X_2 \sim \operatorname{Geo}(p_2)$ be independent random variables, where $\operatorname{Geo}(p)$ refers to the Geometric distribution whose p.m.f. $f$ is given by

$$f(k) = p(1-p)^k,\qquad\qquad k = 0,1,\ldots$$

What is the conditional distribution of $X_1\mid X_1+X_2=y$, assuming $p_1=p_2$?

Here is how I proceeded:

$$P(X_1=t\mid X_1+X_2=y) = \frac{P(X_1=t \cap X_1+X_2=y)}{P(X_1+X_2=y)} = \frac{P(X_1+X_2=y\mid X_1=t)\cdot P(X_1=t)}{P(X_1 + X_2=y)}$$

$$= \frac{P(X_2=y-t)\cdot P(X_1=t)}{P( X_1+X_2=y)}$$

Now we know that, writing $p$ for the common value $p_1=p_2$, $X_1+X_2 \sim \operatorname{NegativeBinomial}(2,p)$, so $P(X_1+X_2=y) = {y+2-1 \choose {2-1}} p^2(1-p)^y = (y+1)p^2(1-p)^y$.

$$\therefore\quad \frac{P(X_2=y-t)\cdot P(X_1=t)}{P( X_1+X_2=y)} = \frac{p^2(1-p)^y}{(y+1)p^2(1-p)^y}=\frac 1 {y+1}, \qquad t = 0,1,\ldots,y.$$

The next part asks for a test of $H_0 : p_1 = p_2$ against the alternative $H_1 : p_1 < p_2$: based on the result obtained in (a), derive a level 0.05 test for $H_0$ against $H_1$ that rejects $H_0$ when $X_1$ is large.

Any ideas on how to approach this problem? I know how to use likelihood ratios to derive a test statistic, but in this case the conditional distribution obtained above does not depend on either $p_1$ or $p_2$.


There are 1 best solutions below


Your derivation of the conditional distribution is a bit more complicated than it needs to be. You can proceed as follows:

\begin{align}
\Pr(X_1=t\mid X_1+X_2=y) &= \frac{\Pr(X_1=t \cap X_1+X_2=y)}{\Pr(X_1+X_2=y)} \\[10pt]
&= \frac{\Pr(X_1=t\cap X_2=y-t)}{\Pr(X_1+X_2=y)} = \frac{p(1-p)^t\cdot p(1-p)^{y-t}}{\Pr(X_1+X_2=y)} = \frac{p^2(1-p)^y}{\cdots} = \cdots
\end{align}

Thus the conditional distribution is a discrete uniform distribution (as you already found).
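As a sanity check on the uniform result, here is a minimal Python sketch that computes the conditional p.m.f. exactly with rational arithmetic. The function names `geo_pmf` and `conditional_pmf`, and the values $p = 3/10$ and $y = 4$, are illustrative choices of mine and not part of the original problem.

```python
from fractions import Fraction


def geo_pmf(k, p):
    """P(X = k) for the geometric law on {0, 1, ...}: p * (1 - p)**k."""
    return p * (1 - p) ** k


def conditional_pmf(y, p):
    """Return [P(X1 = t | X1 + X2 = y) for t = 0..y] when p1 = p2 = p."""
    # Joint probability of (X1 = t, X2 = y - t), using independence.
    joint = [geo_pmf(t, p) * geo_pmf(y - t, p) for t in range(y + 1)]
    total = sum(joint)  # this is P(X1 + X2 = y)
    return [j / total for j in joint]


p = Fraction(3, 10)  # arbitrary illustrative value of the common p
y = 4                # arbitrary illustrative value of the observed sum
print(conditional_pmf(y, p))  # every entry equals 1 / (y + 1)
```

Each term of `joint` is $p^2(1-p)^y$ regardless of $t$, which is why $p$ cancels and the result is the same for any choice of `p`.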

To construct the hypothesis test, one finds the probability distribution of the test statistic assuming the null hypothesis is true, in this case assuming $p_1=p_2.$ With the p.m.f. $p(1-p)^k,$ a smaller $p$ puts more weight on large values, so if $p_1<p_2,$ then $X_1$ is more likely to be greater than $X_2$ than not, whereas if $p_1=p_2,$ then the events $X_1<X_2$ and $X_1>X_2$ are equally probable. You found that, given $X_1+X_2,$ the first variable $X_1$ is uniformly distributed on the set $\{0,1,2,\ldots, X_1+X_2\}.$ If $X_1$ is larger than usual, one would then reject the null hypothesis, which matches the rejection region asked for in the question. So you want $$ \Pr(X_1 \ge \text{what?} \mid X_1+X_2) \le 0.05. $$ I.e. it's in the highest $5\%$ of its range.
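To make the cutoff concrete, here is a hedged Python sketch of a non-randomized conditional test, taking the rejection region from the question (reject $H_0$ when $X_1$ is large). Under $H_0$ the conditional law of $X_1$ given $X_1+X_2=y$ is uniform on $\{0,\ldots,y\}$, so rejecting on the $m$ largest values has conditional size $m/(y+1)$. The names `critical_value` and `reject` are hypothetical, and this is one possible construction rather than the only valid level 0.05 test.

```python
from fractions import Fraction
from math import floor


def critical_value(y, alpha=Fraction(1, 20)):
    """Smallest c with P(X1 >= c | X1 + X2 = y) <= alpha under H0.

    Under H0 the conditional law is uniform on {0, ..., y}, so the
    event {X1 >= c} has probability (y - c + 1) / (y + 1).
    """
    m = floor(alpha * (y + 1))  # how many of the largest values may be rejected
    return y + 1 - m            # c = y + 1 means the test never rejects


def reject(x1, x2, alpha=Fraction(1, 20)):
    """Reject H0 (in favour of p1 < p2) when x1 is in the upper tail."""
    return x1 >= critical_value(x1 + x2, alpha)


print(critical_value(19))  # cutoff when the observed sum is 19
print(critical_value(39))  # cutoff when the observed sum is 39
print(reject(19, 0), reject(18, 1), reject(10, 0))
```

Because the conditional distribution is discrete, this non-randomized version is conservative: when $y < 19$ even the single largest value has conditional probability $1/(y+1) > 0.05$, so the test never rejects. Attaining size exactly 0.05 for every $y$ would require a randomized test.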