Expectation on the number of geometric variables needed for their sum to exceed a threshold


Let $X_1, X_2, \ldots$ be a sequence of independent random variables such that $X_i\sim Geo(p_i)$, and let $T\in\mathbb N$ be a fixed positive integer.

Next, define $N\triangleq \min\{{n\in\mathbb N}\mid \sum\limits_{i=1}^nX_i\ge T\}$.

How can we find $\mathbb E[N]$?

(An upper bound would also work).


This looks like an inverse version of the general form of Wald's equation.

Intuitively, I'd like to say that $\mathbb E[N]\approx\min\{{n\in\mathbb N}\mid \sum\limits_{i=1}^n \mathbb E[X_i]\ge T\} = \min\{{n\in\mathbb N}\mid \sum\limits_{i=1}^n 1/p_i \ge T\}$, but I'm looking for a formal argument.

If this helps, we can assume that $0.01\ge p_1\ge p_2\ge p_3\ge\ldots$


In his answer, Ian gave a solution that handles two extreme cases: one where $\min\{{n\in\mathbb N}\mid \sum\limits_{i=1}^n 1/p_i \ge T\} = \Theta(T)$, and one where the $p_i$ decrease exponentially, so we can just look for the $i$ such that $p_i^{-1}\approx T$ as a good approximation.

However, this approach doesn't seem to work well for what I think are the interesting cases. Since $P(B)=\prod_{i=1}^d(1-p_i)^{x_i-1}\le \prod_{i=1}^d(1-p_d)^{x_i-1}=(1-p_d)^{T-d}$, it fails in the following example: if $p_i = \Theta(1/i)$, then Ian's approach gives a bound of $\mathbb E[N]\le T/\log T$ (after optimizing $d$), while we expect $N$ to be of order $\sqrt T$. This is a near-quadratic gap that the method cannot close.
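To make the $\sqrt T$ claim concrete, here is a small Monte Carlo sketch. The specific cutoff $p_i=\min(0.01, 1/i)$, the function names, and the trial counts are my own illustrative choices; the mean-field heuristic for this choice gives $f(T)\approx\sqrt{2T}$.

```python
import math
import random

def simulate_N(T, p, rng):
    """Draw X_i ~ Geo(p(i)) on {1,2,...} until the partial sum reaches T; return N."""
    s, n = 0, 0
    while s < T:
        n += 1
        u = rng.random()
        # inverse-CDF sample of a geometric on {1,2,...}: ceil(ln u / ln(1-p))
        s += max(1, math.ceil(math.log(u) / math.log(1.0 - p(n))))
    return n

rng = random.Random(0)
T = 100_000
p = lambda i: min(0.01, 1.0 / i)  # p_i = Theta(1/i), capped at 0.01 as in the question
trials = [simulate_N(T, p, rng) for _ in range(200)]
mean_N = sum(trials) / len(trials)
print(mean_N, math.sqrt(2 * T))  # empirical E[N] vs. the sqrt(2T) scale
```

On this example the empirical mean of $N$ lands near the mean-field index (a few hundred), i.e. on the $\sqrt T$ scale rather than the $T/\log T$ scale.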

The simplest thing to do is to split the space into a single good set and a single bad set. The good set is meant to be highly probable and yet also have good values of $N$; the bad set is meant to be highly improbable but perhaps contain bad values of $N$, hopefully still not so bad that we can't control them. One way to do this is to consider the "good" set

$$B=B(x)=\{ X_1 \geq x_1,X_2 \geq x_2,\dots,X_d \geq x_d \}$$

where $x$ is a vector of nonnegative integers satisfying $\sum_{i=1}^d x_i=T.$

We have

$$P(B)=\prod_{i=1}^d q_i^{x_i-1}$$

where $q_i=1-p_i$. On $B(x)$, $N \leq d(x)$. On $B(x)^c$, $N \leq T$ (and this is the best we can do uniformly over all of $B(x)^c$). Thus

$$E[N] \leq d(x) P(B(x)) + (1-P(B(x))) T$$

for any $x$. So the question is to find a good $x$. Suppose $f(T)$ is the "mean field" solution, i.e.

$$f(T)=\min \left \{ n : \sum_{i=1}^n p_i^{-1} \geq T \right \}.$$

Presumably $N$ is typically around $f(T)$. So we should choose $x$ with $d(x)$ a bit bigger than $f(T)$, so that $1-P(B(x))$ is small, but not so much bigger that the first term becomes large. If we don't care about constant factors, then there are three cases:

  • $f(T)=\Theta(T)$. In this case $E[N]$ is also $\Theta(T)$, so up to constant factors we're done.
  • $\frac{p_n^{-1}}{\sum_{i=1}^n p_i^{-1}} \to L>0$ as $n \to \infty$. (For example, this happens if the $p_i$ decay exponentially.) In this case the sum more or less does not matter; the whole thing is dominated by the very last term, so we can think of the first term as just $d p_d^{T-1}$ and the second term as $(1-p_d)T$. Set these equal and solve for $d$, more or less.
  • The intermediate case. Here you basically need to set $d$ a bit bigger than $f(T)$ to account for fluctuations, and then carefully select $x$ to "evenly distribute the cost": choose $x$ so that $x_i$ is proportional to $(-\ln(q_i))^{-1}$. The extent to which $d$ has to exceed $f(T)$ is dictated by making both terms in the inequality for $E[N]$ the same order; the exact behavior will vary a fair bit depending on the specifics of the problem. But once you have chosen $d$, it should not be too bad to pick $x_i$ to be a "rounding" of $\frac{T}{Z_d} (-\ln(q_i))^{-1}$, where $Z_d$ is a normalization constant, and thereby obtain a lower bound on $P(B(x))$ and thus an upper bound on $E[N]$.
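A rough numeric sketch of this recipe is below. The function names and the scan range for $d$ are my own choices, not part of the answer; it evaluates $d\,P(B(x))+(1-P(B(x)))\,T$ with $x_i$ proportional to $(-\ln q_i)^{-1}$.

```python
import math

def mean_field_f(T, p):
    """Least n with sum_{i<=n} 1/p_i >= T."""
    s, n = 0.0, 0
    while s < T:
        n += 1
        s += 1.0 / p(n)
    return n

def bound(T, p, d):
    """Evaluate d*P(B(x)) + (1 - P(B(x)))*T with x_i ~ (T/Z_d) * (-ln q_i)^{-1}."""
    q = [1.0 - p(i) for i in range(1, d + 1)]
    w = [-1.0 / math.log(qi) for qi in q]
    Z = sum(w)
    x = [max(1, round(T * wi / Z)) for wi in w]
    x[-1] += T - sum(x)        # crude repair so that sum(x) == T after rounding
    if x[-1] < 1:
        return float(T)        # fall back to the trivial bound N <= T
    PB = math.exp(sum((xi - 1) * math.log(qi) for xi, qi in zip(x, q)))
    return d * PB + (1.0 - PB) * T

T = 100_000
p = lambda i: min(0.01, 1.0 / i)   # the question's p_i = Theta(1/i) example
n0 = mean_field_f(T, p)
best = min(bound(T, p, d) for d in range(n0, 4 * n0, 16))
print(n0, best)  # mean-field index vs. best bound found on this d-range
```

For this $p_i=\Theta(1/i)$ example, $P(B(x))$ is exponentially small for every $d$ in the scanned range, so the computed bound stays essentially at the trivial $T$; this is consistent with the gap the question points out for the intermediate case.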

You mention you're already interested in the case of constant $p_i=p$. In that case I believe an exact answer is easy. For the moment I assume your geometric random variables take strictly positive integer values, i.e. $P(X_i=k)=(1-p)^{k-1}p$ for $k=1,2,\dots$.

Write $S_n=\sum_{i=1}^n X_i$. Then the sums $S_n$ are distinct. Let us consider the places where the random walk $S_n$ stops; i.e. for each positive integer $m$, consider the event that there is $n$ such that $S_n=m$.

In fact, this event has probability $p$ for each $m$, and the events are independent for different $m$.

(To see this, observe that by basic properties of geometric random variables, we can think of the random walk $S_n$ as follows. To increment the walk, repeatedly step up by $1$, after each step stopping with probability $p$. The probability that any given integer $m$ is a stopping point is $p$, and all these stopping events are independent.)

So in fact the size of the set of stopping points $\{m\in\{1,2,\dots,T-1\}: S_n=m \text{ for some }n\}$ has a Binomial$(T-1, p)$ distribution, and hence mean $p(T-1)$.

Your $N$ is simply $1$ more than the number of these stopping points. Hence $E[N]=p(T-1)+1$.
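This exact formula is easy to confirm by simulation; here is a quick Monte Carlo sketch (the parameter values are my own choices):

```python
import random

def N_of(T, p, rng):
    """Number of Geo(p) draws on {1,2,...} needed for the sum to reach T."""
    s, n = 0, 0
    while s < T:
        n += 1
        # count Bernoulli(p) trials up to and including the first success
        k = 1
        while rng.random() >= p:
            k += 1
        s += k
    return n

rng = random.Random(1)
T, p, reps = 50, 0.3, 20_000
est = sum(N_of(T, p, rng) for _ in range(reps)) / reps
print(est, p * (T - 1) + 1)  # simulated mean vs. the exact formula
```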

If you want instead geometrics which can take the value $0$, i.e. $P(X_i=k)=(1-p)^k p$ for $k=0,1,2,\dots$, then a simple adjustment gives $E[N]=(1-p)^{-1}p(T-1)+(1-p)^{-1}$.
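The adjusted formula for the $\{0,1,2,\dots\}$-valued case can be checked the same way (again a sketch with my own parameter choices):

```python
import random

def N0(T, p, rng):
    """Same stopping index, but with geometrics supported on {0,1,2,...}."""
    s, n = 0, 0
    while s < T:
        n += 1
        k = 0  # number of failures before the first success
        while rng.random() >= p:
            k += 1
        s += k
    return n

rng = random.Random(2)
T, p, reps = 50, 0.3, 20_000
est = sum(N0(T, p, rng) for _ in range(reps)) / reps
exact = (1 - p) ** -1 * p * (T - 1) + (1 - p) ** -1
print(est, exact)  # simulated mean vs. the adjusted formula
```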