Summing uniform (0,1) random variables to estimate $e$, the base of the natural logarithm.

(The question is taken from Casella and Berger, Statistical Inference, exercise $5.58$)

Suppose that $U_1,U_2,\ldots,U_n$ are iid uniform $(0,1)$ random variables, and let $S_n=\sum_{i=1}^nU_i$. Define the random variable $N$ by $$N=\min\{k:S_k>1\}$$

(a) Show that $P(S_k\leq t)=t^k/k!$

Given that I know how to show $E(N)=e$,

(b) How large should $n$ be so that you are $95\%$ confident that you have the first four digits of $e$ correct?

There are 2 best solutions below

1
On BEST ANSWER

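We solve (a), for $0\le t\le 1$, with a proof by induction. The case $k=0$ is trivial for $t>0$ because $S_0=0$, and the case $k=1$ is just $P(U_1\le t)=t$. Moving from $k=n$ to $k=n+1$ for $n\ge 1$ uses a convolution, viz. $\int_0^t\frac{u^{n-1}}{(n-1)!}(t-u)du=\frac{t^{n+1}}{(n+1)!}$. Note that the first factor in the integrand is the PDF of $S_n$, obtained by differentiating the inductive hypothesis, not its CDF; the second factor is $P(U_{n+1}\le t-u)=t-u$.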
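
For readers who want the convolution integral spelled out, it can be evaluated term by term (assuming $0\le t\le 1$ and $n\ge 1$):

$$\int_0^t\frac{u^{n-1}}{(n-1)!}(t-u)\,du=\frac{1}{(n-1)!}\left(\frac{t\cdot t^{n}}{n}-\frac{t^{n+1}}{n+1}\right)=\frac{t^{n+1}}{(n-1)!\,n\,(n+1)}=\frac{t^{n+1}}{(n+1)!}$$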

For (b) note $P(N=n)=\mathbb{E}\,P(1-U_n< S_{n-1}\le 1\mid U_n)=\mathbb{E}\frac{1-(1-U_n)^{n-1}}{(n-1)!}=\frac{n-1}{n!}$ (as a sanity check, a telescoping sum on $n\ge 2$ verifies that these probabilities sum to $1$). Thus $\mathbb{E}N=\sum_{n\ge 2}\frac{1}{(n-2)!}=e$, as you already knew, while $\mathbb{E}N^2=\sum_{n\ge 2}\frac{n}{(n-2)!}=2+\sum_{n\ge 3}(\frac{1}{(n-3)!}+\frac{2}{(n-2)!})=2+e+2(e-1)=3e$ so $\operatorname{Var}N=e(3-e)$. A Normal approximation to the sample mean of independent copies of $N$ is all you need to finish the problem.
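
To put a number on it, here is a sketch of that Normal approximation. It assumes that "first four digits correct" means $|\bar N_n-e|\le 0.0005$, which is one possible reading and not part of the exercise statement; $\bar N_n$ denotes the mean of $n$ independent copies of $N$, and $1.96$ is the usual two-sided $95\%$ Normal quantile.

$$\bar N_n\approx \mathcal N\!\left(e,\ \frac{e(3-e)}{n}\right),\qquad 1.96\sqrt{\frac{e(3-e)}{n}}\le 0.0005\iff n\ge\left(\frac{1.96}{0.0005}\right)^2 e(3-e)\approx 1.2\times 10^7$$

A stricter or looser tolerance changes the answer by the square of the ratio, so the order of magnitude, about $10^7$, is the robust part of the conclusion.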

0
On

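First note that part (a) is only true for $0\le t\leq 1$, and that it is enough to treat the case $t=1$: for $t\le 1$ the region $\{x\in[0,1]^k:\sum_i x_i\le t\}$ is the $t=1$ region scaled by $t$, so its volume picks up a factor $t^k$.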

A nice geometric way to see this without induction is to consider the map $F:\mathbb{R}^n\rightarrow \mathbb{R}^n$ given by $$F(x)=(x_1,x_1+x_2, \ldots,x_1+\ldots+x_n)$$ $F$ is volume preserving (think of the matrix), and it maps the simplex $$\Delta_n = \{x\in [0,1]^n : \sum_{i=1}^nx_i \leq 1 \}$$ onto $$ U=\{y\in [0,1]^n:y_1\leq y_2\leq\ldots\leq y_n\} $$ The volume of $U$ is clearly $1/n!$ - permuting the indices, $n!$ copies of $U$ cover the cube (the overlaps are of lower dimension). Hence $P(S_n\le 1)=\operatorname{vol}(\Delta_n)=1/n!$.
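
To make "think of the matrix" concrete, $F$ is linear and its matrix is lower triangular with ones on and below the diagonal, so its determinant is $1$ and the inverse can be written down directly:

$$F=\begin{pmatrix}1&0&\cdots&0\\1&1&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\1&1&\cdots&1\end{pmatrix},\qquad \det F=1,\qquad F^{-1}(y)=(y_1,\ y_2-y_1,\ \ldots,\ y_n-y_{n-1})$$

Under this correspondence $x_i\ge 0$ for all $i$ is the same as $0\le y_1\le y_2\le\ldots\le y_n$, and $\sum_{i=1}^n x_i\le 1$ is the same as $y_n\le 1$, which is why the image of $\Delta_n$ is exactly $U$.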

It is immediate from the definition of $N$ that $\{N=n\} = \{S_{n-1}\leq 1 \text{ and } S_n> 1\}$ and therefore $Pr(N=n)=Pr(S_{n-1}\le 1)-Pr(S_n\le 1)=\frac{1}{(n-1)!} - \frac{1}{n!}=\frac{n-1}{n!}$ for $n\ge 2$. Hence $E(N) = \sum_{n\ge 2}\frac{n\cdot(n-1)}{n!}=\sum_{n\ge 2}\frac{1}{(n-2)!}=e$. Similarly, substituting $m=n-2$, $E(N^2) = \sum_{n\ge 2}\frac{n^2\cdot(n-1)}{n!}=\sum_{n\ge 2}\frac{n}{(n-2)!}=\sum_{m\ge 0}\frac{m+2}{m!}=e+2e$ and therefore $Var(N)=E(N^2)-E(N)^2=3e-e^2=e\cdot(3-e)$.

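By the CLT, $\frac{\overline{N_n}-e}{\sqrt{e\cdot (3-e)/n}}$ is approximately $N(0,1)$ (the denominator is the standard deviation of $\overline{N_n}$, not its variance), so $\overline{N_n}$ will land inside $e\pm 2\sqrt{e(3-e)}/\sqrt n$ with probability about $95\%$. Now $2\sqrt{e(3-e)}\cong 1.75$, so we will need a huge $n$ to get to the fourth digit.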

Here is a short simulation in R:

> # N() returns the first k at which the running sum of k uniforms reaches 1
> N <- function(n=20){(1:n)[floor(cumsum(runif(n))) > 0][1]}
> round(replicate(10, { mean(replicate(10^5,N())) - exp(1) }), 3)
 [1]  0.002 -0.006  0.003 -0.002  0.004  0.002 -0.006 -0.003  0.002 -0.002

The errors are still a few thousandths, so $n=10^5$ reliably gives only the first two digits of $e$.
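
As a further check on the moments used in the argument, here is a minimal sketch in R that compares the sample mean and variance of $N$ with $e$ and $e(3-e)$. The names `draw_N` and `max_terms` are illustrative and not from the original answer, and the seed is arbitrary.

```r
# Sketch: draw N = min{k : S_k > 1} and compare sample moments with theory.
# draw_N and max_terms are illustrative names (assumptions, not from the source).
draw_N <- function(max_terms = 20) {
  # which() gives the indices where the running sum exceeds 1; take the first.
  # P(S_20 <= 1) = 1/20!, so 20 terms are enough in practice.
  which(cumsum(runif(max_terms)) > 1)[1]
}

set.seed(1)                       # arbitrary seed, only for reproducibility
x <- replicate(1e5, draw_N())

c(mean = mean(x), theory = exp(1))                # E(N) = e
c(var = var(x), theory = exp(1) * (3 - exp(1)))   # Var(N) = e(3 - e)
sd(x) / sqrt(length(x))                           # standard error of the mean
```

The last line estimates the standard error of the mean directly from the sample, so it can be compared with the size of the errors seen in the runs above.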