What is the distribution of $N-X_N$ if $X_i$'s are i.i.d $\operatorname{Exp}(1)$ and $N=\min\{n\ge1:X_n>1\}$?


Suppose $(X_n)_{n\in\mathbb N}$ is an i.i.d. sequence of random variables, where $X_1$ has an exponential distribution with mean $1$. Let $N=\min\{n\ge1:X_n>1\}$. I am asked to find the distribution of $N-X_N$.

Now $N$ has a geometric distribution, with

$$P(N>n)=P(X_1\le 1,\ldots,X_n\le 1)=(1-e^{-1})^n\quad,\,n\in\mathbb N$$

And it can be shown that the distribution function of $X_N$ is just

$$P(X_N\le x)=P(X_1\le x\mid X_1>1)\,,$$

so that $X_N$ has a shifted exponential distribution with density

$$f_{X_N}(x)=e^{-(x-1)}1_{x>1}$$
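(To fill in the "it can be shown" step: on $\{N=n\}$ one has $X_N=X_n$, and

\begin{align} P(X_N\le x)&=\sum_{n=1}^\infty P(X_1\le 1,\ldots,X_{n-1}\le 1,\,1<X_n\le x) \\&=\sum_{n=1}^\infty (1-e^{-1})^{n-1}\,P(1<X_1\le x) \\&=\frac{P(1<X_1\le x)}{1-(1-e^{-1})}=\frac{P(1<X_1\le x)}{P(X_1>1)}=P(X_1\le x\mid X_1>1),\end{align}

using $\sum_{n\ge1}(1-e^{-1})^{n-1}=e=1/P(X_1>1)$.)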

Since the joint distribution of $N$ and $X_N$ is not given explicitly, the question would be straightforward if $N$ and $X_N$ were independent. In any case, I tried to find the distribution function directly:

\begin{align} P(N-X_N\le x)&=\sum_{n=1}^\infty P(n-X_n\le x,\,N=n) \\&=\sum_{n=1}^\infty P(X_n\ge n-x,\,X_1\le 1,\ldots,X_{n-1}\le 1,\,X_n>1) \\&=\sum_{n=1}^\infty P(X_1 > \max(n-x,1))\,(P(X_1\le 1))^{n-1} \\&=\sum_{n=1}^\infty e^{-\max(n-x,1)}(1-e^{-1})^{n-1} \\&=\sum_{n=1}^{\lfloor x+1\rfloor}e^{-1}(1-e^{-1})^{n-1}+\sum_{n=\lfloor x+1\rfloor+1}^\infty e^{-(n-x)}(1-e^{-1})^{n-1}\quad\text{(for }x\ge0\text{)} \end{align}

The last two sums can be evaluated, but I am not sure if I arrive at a valid answer. Is there a simpler way to solve this, perhaps using some independence argument? Is it guaranteed that $N-X_N$ will be absolutely continuous? I could say that if $N$ and $X_N$ were independent, but don't think that is true here.
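(Completing the evaluation: for $x\ge0$ one has $\max(n-x,1)=1$ exactly when $n\le\lfloor x+1\rfloor$, so write $m=\lfloor x+1\rfloor$ for the integer split point. Both sums are then geometric:

\begin{align} \sum_{n=1}^{m}e^{-1}(1-e^{-1})^{n-1}&=1-(1-e^{-1})^{m}, \\ \sum_{n=m+1}^\infty e^{-(n-x)}(1-e^{-1})^{n-1}&=e^{x-1}\sum_{n=m+1}^\infty\left(\frac{1-e^{-1}}{e}\right)^{n-1}=\frac{(1-e^{-1})^{m}\,e^{x-m}}{e-1+e^{-1}}, \end{align}

so that for $x\ge0$
$$P(N-X_N\le x)=1-(1-e^{-1})^{m}\left(1-\frac{e^{x-m}}{e-1+e^{-1}}\right),\qquad m=\lfloor x+1\rfloor.)$$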


@Henry has pointed out that $N$ and $X_N$ are indeed independent. I think I understand the logic but would like to see a formal proof of the independence.
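A formal proof of the independence is short: for $n\ge1$ and Borel $A\subseteq(1,\infty)$,

\begin{align} P(N=n,\,X_N\in A)&=P(X_1\le1,\ldots,X_{n-1}\le1,\,X_n\in A) \\&=(1-e^{-1})^{n-1}\,P(X_1\in A) \\&=\underbrace{(1-e^{-1})^{n-1}e^{-1}}_{P(N=n)}\cdot\frac{P(X_1\in A)}{P(X_1>1)} \\&=P(N=n)\,P(X_1\in A\mid X_1>1). \end{align}

Summing over $n$ gives $P(X_N\in A)=P(X_1\in A\mid X_1>1)$, so the joint law factorizes and $N$ and $X_N$ are independent.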

Assuming independence, and using that $N$ is integer-valued, so $P(N\le t)=1-(1-e^{-1})^{\lfloor t\rfloor}$ for $t\ge1$, I get

\begin{align} P(N-X_N\le y)&=\int P(N\le y+x)f_{X_N}(x)\,dx \\&=\int_{1-y}^\infty \left(1-(1-e^{-1})^{\lfloor y+x\rfloor}\right)e^{-(x-1)}dx\,1_{y<0}+\int_1^\infty \left(1-(1-e^{-1})^{\lfloor y+x\rfloor}\right)e^{-(x-1)}dx\,1_{y>0} \end{align}

Does this look right?

1 Answer

As I said in the comments, I think it is clear that $X_N$ is independent of $N$: conditional on $N=n$, the value $X_N=X_n$ is an exponential variable conditioned to exceed $1$, whatever $n$ is, so its value does not depend on how many previous failures there have been.

I think if you battle through the calculations, then you end up with the cumulative distribution function $$\mathbb P(N-X_N \le x)=1 - \left(1-e^{-1}\right)^{\lceil x\rceil} \left(1 - \frac{e^{x-\lceil x\rceil}}{e-1+e^{-1}}\right)$$ where $\lceil x\rceil$ here denotes the ceiling of $x$ clamped below at $0$, i.e. $\max(\lceil x\rceil, 0)$.
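As a quick sanity check of this closed form (a sketch in Python rather than R, purely for convenience; the function name is mine), the formula behaves like a genuine cdf: it increases from $0$ to $1$, is continuous across the integer kinks, and at $x=0$ equals $1/(e-1+e^{-1})\approx0.4793$, which is also $P(E\ge N-1)=\sum_{k\ge0}e^{-1}(1-e^{-1})^k e^{-k}$ computed directly, writing $X_N=1+E$ with $E\sim\operatorname{Exp}(1)$.

```python
import math

def cdf_N_minus_XN(x):
    """Closed-form cdf of N - X_N from above:
    1 - (1 - e^-1)^m * (1 - e^(x-m) / (e - 1 + e^-1)),  m = max(ceil(x), 0)."""
    m = max(math.ceil(x), 0)
    return 1 - (1 - math.exp(-1)) ** m * (1 - math.exp(x - m) / (math.e - 1 + math.exp(-1)))

# At x = 0 this reduces to 1/(e - 1 + e^-1):
print(round(cdf_N_minus_XN(0), 4))  # 0.4793
```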

As an illustration, here is a simulation in R, comparing an empirical cdf of $N-X_N$ as originally defined, an empirical cdf generated using independence, and my theoretical cdf. The blue, red and green lines in the graph clearly match well, and you can see the kinks at the non-negative integers.

set.seed(1)
cases <- 10^4

N <- numeric(cases)
XN <- numeric(cases)
for (i in 1:cases){
    dat   <- rexp(100, 1)         # 100 draws; P(none exceeds 1) = (1-e^-1)^100, negligible
    N[i]  <- min(which(dat > 1))  # index of first value exceeding 1
    XN[i] <- dat[N[i]]            # that value itself
    }
plot.ecdf(N - XN, col="blue")

Nind <- rgeom(cases, exp(-1)) + 1  # rgeom starts at 0
XNind <- rexp(cases, 1) + 1        # must be >= 1 here
plot.ecdf(Nind - XNind, col="red", add=TRUE)

above <- function(x){
    # ceiling of x, clamped below at 0
    ifelse(x < 0, 0, ceiling(x))
    }
pNminusXN <- function(x){
    # theoretical cdf of N - X_N
    1 - (1-exp(-1))^above(x) * (1 - exp(x-above(x))/(exp(1)-1+exp(-1)))
    }
curve(pNminusXN, from=-5, to=15, col="green", add=TRUE)
