Distribution of a quotient of empirical averages

Consider a sequence of $N$ random variables $y_1, y_2, \dots, y_N$ such that $y_i \sim B(p)$ (a Bernoulli with parameter $p$). What is the distribution of $\frac{\bar{y}}{1 - \bar{y}}$, where $\bar{y} = \frac{1}{N}\sum_{i=1}^N y_i$ is the empirical average?

In fact, I am trying to solve the slightly more difficult question of determining the distribution of $\log\big(\frac{\bar{y}}{1 - \bar{y}}\big)$, but I understand that the Delta Method should be available to me. Thank you!

On BEST ANSWER

I assume that the random variables are also independent. Below I write $Y_i$ for the variables and $n$ for the sample size.

Then

$$ \sum_{i=1}^n Y_i \sim \text{Binomial}(n, p) $$

$$ \Pr\left\{\sum_{i=1}^n Y_i = y\right\} = \binom {n} {y}p^y(1-p)^{n-y}, y = 0, 1, \ldots, n$$

So for $y = 1, \ldots, n - 1$ (setting aside the $y = 0$ and $y = n$ cases for now), consider

$$ \begin{align} \Pr\left\{\sum_{i=1}^n Y_i = y\right\} &= \Pr\left\{\bar{Y} = \frac {y} {n} \right\} \\ &= \Pr\left\{\frac {1} {\bar{Y}} = \frac {n} {y} \right\} \\ &= \Pr\left\{\frac {1} {\bar{Y}} - 1 = \frac {n} {y} - 1 \right\} \\ &= \Pr\left\{\frac {1} {\displaystyle \frac {1} {\bar{Y}} - 1} = \frac {1} {\displaystyle \frac {n} {y} - 1} \right\} \\ &= \Pr\left\{\frac {\bar{Y}} {1 - \bar{Y}} = \frac {y} {n - y} \right\} \end{align} $$

The result also holds for $y = 0$, since $$ \sum_{i=1}^n Y_i = y = 0 \iff \frac {\bar{Y}} {1 - \bar{Y}} = 0 = \frac {0} {n - 0} $$ When $y = n$, it holds as well if we allow the value $+\infty$: $$ \sum_{i=1}^n Y_i = y = n \iff \frac {\bar{Y}} {1 - \bar{Y}} = +\infty $$

So we may conclude the pmf:

$$ \Pr\left\{\frac {\bar{Y}} {1 - \bar{Y}} = \frac {y} {n - y} \right\} = \binom {n} {y}p^y(1-p)^{n-y}, y = 0, 1, \ldots, n $$

We can put $k = \frac {y} {n - y} \iff y = \frac {nk} {k + 1}$ to write out the pmf explicitly (reading $k = +\infty$ as the limiting case $y = n$):

$$ \Pr\left\{\frac {\bar{Y}} {1 - \bar{Y}} = k \right\} = \binom {n} {\frac {nk} {k + 1}}p^{\frac {nk} {k + 1}}(1-p)^{\frac {n} {k + 1}}, k = 0, \frac {1} {n - 1}, \frac {2} {n - 2}, \ldots, n - 1, +\infty $$
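As a quick sanity check, the pmf above can be tabulated for a small sample. This is only a minimal sketch: the function name `odds_pmf` and the values $n = 4$, $p = 0.3$ are illustrative assumptions, and the case $y = n$ is stored under the key `inf`.

```python
from fractions import Fraction
from math import comb, inf

def odds_pmf(n, p):
    """Map each value of Ybar / (1 - Ybar) to its probability (Y_i iid Bernoulli(p))."""
    pmf = {}
    for y in range(n + 1):
        # y = n means Ybar = 1, where the odds are +infinity
        k = inf if y == n else Fraction(y, n - y)
        pmf[k] = comb(n, y) * p**y * (1 - p)**(n - y)
    return pmf

# Illustrative values, not taken from the question
for k, prob in odds_pmf(4, 0.3).items():
    print(k, prob)
```

The keys are exact fractions $y/(n-y)$, so the support $0, \frac{1}{n-1}, \ldots, n-1, +\infty$ can be read off directly, and the probabilities are just the Binomial$(n, p)$ masses relabelled.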

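Similarly for the $\log$ transformation, since $\ln$ is strictly increasing (with the conventions $\ln 0 = -\infty$ and $\ln(+\infty) = +\infty$):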

$$ \Pr\left\{\ln\left(\frac {\bar{Y}} {1 - \bar{Y}}\right) = \ln\left(\frac {y} {n - y}\right) \right\} = \binom {n} {y}p^y(1-p)^{n-y}, y = 0, 1, \ldots, n $$

Or equivalently, $$ \Pr\left\{\ln\left(\frac {\bar{Y}} {1 - \bar{Y}}\right) = k \right\} = \binom {n} {\frac {ne^k} {e^k + 1}}p^{\frac {ne^k} {e^k + 1}}(1-p)^{\frac {n} {e^k + 1}}, k = -\infty, \ln\frac {1} {n - 1}, \ln\frac {2} {n - 2}, \ldots, \ln(n - 1), +\infty $$
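The same relabelling can be sketched for the log-odds, together with a check that $y = \frac{ne^k}{e^k + 1}$ recovers the count from a finite support point $k$. The helper names `logit_pmf` and `count_from_logit`, and the values $n = 6$, $p = 0.4$, are assumptions made for illustration.

```python
from math import comb, exp, inf, log

def logit_pmf(n, p):
    """List of (k, probability) pairs for k = ln(Ybar / (1 - Ybar))."""
    pairs = []
    for y in range(n + 1):
        if y == 0:
            k = -inf          # Ybar = 0, odds 0
        elif y == n:
            k = inf           # Ybar = 1, odds +infinity
        else:
            k = log(y / (n - y))
        pairs.append((k, comb(n, y) * p**y * (1 - p)**(n - y)))
    return pairs

def count_from_logit(n, k):
    """Recover y = n e^k / (e^k + 1) from a finite support point k."""
    return n * exp(k) / (exp(k) + 1)

# Illustrative values, not taken from the question
n, p = 6, 0.4
for k, prob in logit_pmf(n, p):
    y = count_from_logit(n, k) if abs(k) != inf else None
    print(k, y, prob)
```

The two infinite support points carry mass $(1-p)^n$ and $p^n$, which is why the finite-sample log-odds has no finite mean or variance unless those outcomes are excluded.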

For the asymptotic distribution, first recall that by the CLT,

$$ \sqrt{n}\left(\bar{Y} - p\right) \stackrel {d} {\to} \mathcal{N}(0, p(1-p)) $$

Note that

$$ \frac {d} {dx} \ln\left(\frac {x} {1 - x}\right) = \frac {1} {x(1 - x)}$$ $$ \frac {d^2} {dx^2} \ln\left(\frac {x} {1 - x}\right) = \frac {2x - 1} {x^2(1-x)^2} $$

Taylor expand the expression around $\bar{Y} = p$: $$ \ln\left(\frac {\bar{Y}} {1 - \bar{Y}}\right) \approx \ln\left(\frac {p} {1 - p}\right) + \frac {1} {p(1-p)} (\bar{Y} - p) + \frac {2p - 1} {2p^2(1-p)^2} (\bar{Y} - p)^2 $$

If we keep only the first-order term, we get the Delta Method result: $$ \sqrt{n}\left[\ln\left(\frac {\bar{Y}} {1 - \bar{Y}}\right) - \ln\left(\frac {p} {1 - p}\right) \right] \approx \frac {1} {p(1-p)} \sqrt{n}(\bar{Y} - p) \stackrel {d} {\to} \mathcal{N}\left(0, \frac {1} {p(1-p)}\right)$$
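One way to see how good the Delta Method variance $\frac{1}{p(1-p)}$ is for a given $n$ is to compute the exact variance of the scaled log-odds from the binomial pmf. The sketch below assumes we condition on $0 < \sum Y_i < n$ (otherwise the log-odds is infinite); the function names and the choice $p = 0.3$ are illustrative, and the pmf is evaluated in log space via `lgamma` to avoid overflow for large $n$.

```python
from math import exp, lgamma, log, sqrt

def binom_pmf(n, y, p):
    """Binomial(n, p) pmf at y, computed in log space to avoid overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
               + y * log(p) + (n - y) * log(1 - p))
    return exp(log_pmf)

def scaled_logit_variance(n, p):
    """Exact variance of sqrt(n) * (logit(Ybar) - logit(p)), given 0 < sum(Y) < n."""
    target = log(p / (1 - p))
    ys = range(1, n)  # drop y = 0 and y = n, where the log-odds are infinite
    probs = [binom_pmf(n, y, p) for y in ys]
    total = sum(probs)  # renormalise after conditioning
    vals = [sqrt(n) * (log(y / (n - y)) - target) for y in ys]
    mean = sum(w * v for w, v in zip(probs, vals)) / total
    return sum(w * (v - mean) ** 2 for w, v in zip(probs, vals)) / total

p = 0.3  # illustrative value
delta_var = 1 / (p * (1 - p))
for n in (50, 500, 5000):
    print(n, scaled_logit_variance(n, p), delta_var)
```

For increasing $n$ the exact conditional variance should approach the Delta Method value, with the gap shrinking roughly like $1/n$.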

We can also write out the second-order term: $$ \begin{align} &~ \sqrt{n}\left[\ln\left(\frac {\bar{Y}} {1 - \bar{Y}}\right) - \ln\left(\frac {p} {1 - p}\right) \right] \\ \approx &~ \frac {1} {p(1-p)} \sqrt{n}(\bar{Y} - p) + \frac {1} {\sqrt{n}} \frac {2p - 1} {2p(1-p)} \left[\sqrt{n}\frac {\bar{Y} - p} {\sqrt{p(1-p)}} \right]^2 \\ \approx &~ \mathcal{N}\left(0, \frac {1} {p(1-p)}\right) + \frac {1} {\sqrt{n}} \frac {2p - 1} {2p(1-p)} \chi^2(1) \end{align}$$ The last line is shorthand for the limiting distributions of the two terms. Both are functions of the same $\bar{Y}$, so the normal and chi-square parts are not independent.

So the second-order chi-square term vanishes at rate $1/\sqrt{n}$. If you want a finite-sample approximation and $\sqrt{n}$ is not too large, you may want to include the second-order term.
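To illustrate what the second-order term buys in finite samples: since $E[\chi^2(1)] = 1$, it predicts a mean of about $\frac{1}{\sqrt{n}} \frac{2p - 1}{2p(1-p)}$ for the scaled log-odds, whereas the first-order approximation predicts zero. The sketch below compares that prediction with the exact mean, again conditioning on $0 < \sum Y_i < n$ as an assumption; the function names and $p = 0.3$ are illustrative.

```python
from math import exp, lgamma, log, sqrt

def binom_pmf(n, y, p):
    """Binomial(n, p) pmf at y, computed in log space to avoid overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
               + y * log(p) + (n - y) * log(1 - p))
    return exp(log_pmf)

def scaled_logit_mean(n, p):
    """Exact mean of sqrt(n) * (logit(Ybar) - logit(p)), given 0 < sum(Y) < n."""
    target = log(p / (1 - p))
    ys = range(1, n)  # drop y = 0 and y = n, where the log-odds are infinite
    probs = [binom_pmf(n, y, p) for y in ys]
    total = sum(probs)  # renormalise after conditioning
    return sum(w * sqrt(n) * (log(y / (n - y)) - target)
               for w, y in zip(probs, ys)) / total

def second_order_mean(n, p):
    """Mean of the chi-square correction term, using E[chi2(1)] = 1."""
    return (2 * p - 1) / (2 * p * (1 - p) * sqrt(n))

p = 0.3  # illustrative value
for n in (50, 500, 5000):
    print(n, scaled_logit_mean(n, p), second_order_mean(n, p))
```

At $p = 1/2$ the correction is zero and the exact mean vanishes by symmetry; away from $1/2$ the two columns should agree increasingly well as $n$ grows.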