Sum of iid variables where number of summands is random

Let $(X_n)$ be a sequence of iid random variables with mean $\mu$ and variance $\sigma^2\lt \infty$. Set $S_0=0$ and $S_n=X_1+...+X_n$ for $n\ge 1$. Let $N$ be a bounded non-negative integer-valued random variable which is independent of the sequence $(X_n)$. a) Show $E(S_N)=\mu E(N)$. b) Find $E(S^2_N)$ and $\text{Var}( S_N)$ in terms of $\text{Var}(N)$.

Now consider the case where $X_1$ only takes values $1$ and $-1$. Fix $a\ge 0$ and set $T=\min\{n\ge 0:|S_n|=a\}$. c) Show $E(S_T)=\mu E(T)$ and find $\text{Var}(S_T)$

I've solved a) and b) using the law of total expectation but I'm at a loss as to why we need the assumption that $N$ is independent of the sequence $(X_n)$ and this is preventing me from making sense of c). For instance, my steps for a) are essentially

  1. $E(S_N)=E(E(S_N|N=n))$
  2. $E(S_N)=\sum_n E(S_n)P(N=n)$
  3. $E(S_N)=\sum_n \mu nP(N=n)$
  4. $E(S_N)=\mu E(N)$

The steps for b) are very similar, with only algebraic manipulation beyond the above. As far as I can tell, none of the steps above uses the independence of $N$ from $(X_n)$, so could I simply replace $N$ with $T$, so that the results for c) are the same in terms of $E(T)$ and $\text{Var}(T)$? (Surely the question setter would not intend that.)

Please don't answer this problem for me, I will be very grateful for a small nudge in the right direction just so I can proceed with the question myself.

EDIT: I am interested in whether this question has not been answered because I have explained my intentions poorly or if my case is plausible? Please leave me a comment either way.

There are 3 best solutions below

1
On

Let me try to give you a helping hand:

You have used the independence of $N$ and $X_1,X_2,...$ exactly at this step:

$$E[S_N|N=n]=E[S_n]$$

To see why this is not true in the dependent case, take $X_i$ i.i.d. $\operatorname{Bernoulli}\left(\frac{1}{2}\right)$ (coin tosses), and let $N$ be the number of consecutive $0$s at the start of the sequence (obviously $N$ depends on $X_1,X_2,\ldots$).

What happens now when $N=n$? It means the first $n$ $X$s are zero and the $(n+1)$-th is one:

$$X_1=X_2=\ldots=X_n=0\text{ and }X_{n+1}=1$$

So:

$E[S_N|N=n]=E[X_1|N=n]+\ldots+E[X_n|N=n]=0+\ldots+0 =0$

However:

$E[S_n]=E[X_1]+\ldots+E[X_n]=\frac{1}{2}+\ldots+\frac{1}{2}=\frac{n}{2}$
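The coin-toss counterexample above can also be checked numerically. The following is only a minimal sketch: the function name `sample_n_and_sum`, the seed and the number of trials are arbitrary choices made for illustration. Note that $N$ here is unbounded, unlike in the question, so this only illustrates why $E[S_N|N=n]=E[S_n]$ fails under dependence.

```python
import random

def sample_n_and_sum(rng):
    """Toss a fair coin until the first 1 appears.

    Returns (N, S_N), where N counts the leading zeros and
    S_N = X_1 + ... + X_N is the sum of exactly those tosses.
    """
    tosses = []
    while True:
        x = rng.randint(0, 1)   # X_i ~ Bernoulli(1/2)
        if x == 1:
            break
        tosses.append(x)
    return len(tosses), sum(tosses)

rng = random.Random(0)          # fixed seed, chosen arbitrarily
trials = 100_000
samples = [sample_n_and_sum(rng) for _ in range(trials)]

mean_n = sum(n for n, _ in samples) / trials
mean_s_n = sum(s for _, s in samples) / trials

print("estimate of E[N]      :", mean_n)
print("estimate of E[S_N]    :", mean_s_n)
print("mu * estimate of E[N] :", 0.5 * mean_n)
```

Every sampled $S_N$ is a sum of zeros, so the second printed value cannot agree with the third unless no zeros were ever tossed.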

Hope this helps to clear up your confusion.

Note: One more minor thing, your Step 1. for a) should read $E(E(S_N|N))$ instead of $E(E(S_N|N=n))$, so the first two steps should be:

$E(S_N)=E(E(S_N|N))=\sum_n E(S_N|N=n)P(N=n)=\sum_n E(S_n)P(N=n)$

(the independence was used for the last equality)

0
On

Using the law of total expectation you get

$$\mathbb{E}(S_N)=\mathbb{E}[\mathbb{E}(S_N|N)]$$

Since, by the independence of $N$ and the $X_i$,

$$\mathbb{E}(S_N|N=n)=\mathbb{E}\left[\sum_{i=1}^{N}X_i\,\Big|\,N=n\right]=\sum_{i=1}^{n}\mathbb{E}[X_i]=n \mu$$

we get

$$\mathbb{E}(S_N)=\mathbb{E}(N\mu)=\mu \mathbb{E}(N)$$


For the second moment, similarly,

$$\mathbb{E}(S_N^2)=\mathbb{E}[\mathbb{E}(S_N^2|N)]$$

and we have

$$\mathbb{E}(S_N^2|N=n)=\mathbb{E}\left(\sum_{k=1}^{n} \sum_{h=1}^{n}X_kX_h\right)=\sum_{k=1}^{n} \sum_{h=1}^{n}\mathbb{E}(X_kX_h)=\sum_{k=1}^{n} \sum_{h=1}^{n}\left[ \operatorname{Cov}(X_k,X_h)+\mu^2 \right]$$

Since the $X_k$ are independent, they are also uncorrelated, so $\operatorname{Cov}(X_k,X_h)=0$ for $k\ne h$ while $\operatorname{Cov}(X_k,X_k)=\sigma^2$, thus

$$\mathbb{E}(S_N^2|N=n)=n\sigma^2+n^2\mu^2$$

and thus

$$\mathbb{E}(S_N^2)=\mathbb{E}(N\sigma^2+N^2\mu^2)=\mathbb{E}(N)\sigma^2+\mathbb{E}(N^2)\mu^2$$

Finally the variance can be calculated as

$$\mathbb{V}[S_N]=\mathbb{E}[S_N^2]-\mathbb{E}^2[S_N]=\mathbb{E}[N]\sigma^2+\mu^2\left(\mathbb{E}[N^2]-\mathbb{E}^2[N]\right)=\mathbb{E}[N]\sigma^2+\mu^2\mathbb{V}[N]$$
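As a sanity check on the mean and variance formulas, here is a small sketch that computes the moments of $S_N$ exactly by brute-force enumeration for one concrete case. The two distributions `x_pmf` and `n_pmf` and the helper names are hypothetical choices made only for this illustration; $N$ is taken independent of the $X_i$, as in parts a) and b).

```python
from fractions import Fraction
from itertools import product

# Hypothetical example laws, chosen only for illustration.
x_pmf = {-1: Fraction(1, 4), 2: Fraction(3, 4)}                      # law of each X_i
n_pmf = {0: Fraction(1, 6), 1: Fraction(1, 3), 3: Fraction(1, 2)}    # law of N (bounded)

def mean_and_variance(pmf):
    """Exact mean and variance of a finite distribution {value: probability}."""
    mean = sum(v * pr for v, pr in pmf.items())
    second = sum(v * v * pr for v, pr in pmf.items())
    return mean, second - mean ** 2

def random_sum_moments(x_pmf, n_pmf):
    """Exact mean and variance of S_N, enumerating N and (X_1, ..., X_N).

    Independence of N and the X_i is built in: the joint probability
    is the product P(N = n) * P(X_1 = x_1) * ... * P(X_n = x_n).
    """
    first = Fraction(0)
    second = Fraction(0)
    values = list(x_pmf)
    for n, p_n in n_pmf.items():
        for xs in product(values, repeat=n):   # n = 0 yields the empty sum
            prob = p_n
            for x in xs:
                prob *= x_pmf[x]
            s = sum(xs)
            first += prob * s
            second += prob * s * s
    return first, second - first ** 2

mu, sigma2 = mean_and_variance(x_pmf)
mean_n, var_n = mean_and_variance(n_pmf)
mean_s, var_s = random_sum_moments(x_pmf, n_pmf)

print("E[S_N]   :", mean_s, " vs  mu*E[N]                :", mu * mean_n)
print("Var[S_N] :", var_s, " vs  E[N]*sigma^2+mu^2*Var[N]:", mean_n * sigma2 + mu ** 2 * var_n)
```

Using `Fraction` keeps the arithmetic exact, so the two sides can be compared with `==` rather than up to a tolerance.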

2
On

In your item $2$ and subsequent items you appear to be assuming that $\ E\big(S_n\,\big|\,N=n\big)=E\big(S_n\big)\ $. This isn't necessarily true if $\ N\ $ is not independent of the sequence $\ \big\{X_n\big\}\ $.

Suppose, for example, that $\ \big\{X_i\big\}\ $ are Bernoulli $p$-variables (with $\ 0<p<1\ $) and $\ N=\max(2X_1,X_2)+1\ $. Then $\ E\big(X_1\,\big|\,N=n\big)=\delta_{3n}\ $, $\ E\big(X_2\,\big|\,N=n\big)= \delta_{2n}+p\,\delta_{3n}\ $ (the event $\ N=3\ $ is the event $\ X_1=1\ $, which leaves $\ X_2\ $ unconstrained), $\ E\big(X_3\,\big|\,N=n\big)=p=\mu\ $, and $\ E(N)= (1-p)^2+2\big(p-p^2\big)+3p=1+3p-p^2\ $.

For $\ n=1\ $
\begin{align} E\big(S_n\,\big|\,N=n\big)&=E\big(X_1\,\big|\,N=1\big)\\ &=0\\ &\ne E\big(S_n\big)=p\ , \end{align}
for $\ n=2\ $
\begin{align} E\big(S_n\,\big|\,N=n\big)&=E\big(X_1\ \big|\,N=2\big)+E\big(X_2\ \big|\,N=2\big)\\ &=1\\ &\ne E\big(S_n\big)=2p\ \ \ \Big(\text{unless }\ p=\frac{1}{2}\ \Big) , \end{align}
and for $\ n=3\ $
\begin{align} E\big(S_n\,\big|\,N=n\big)&=E\big(X_1\ \big|\,N=3\big)+E\big(X_2\ \big|\,N=3\big)+E\big(X_3\ \big|\,N=3\big)\\ &=1+2p\\ &\ne E\big(S_n\big)=3p\ \ \ \big(\text{since }\ p<1\ \big). \end{align}
Also
\begin{align} E\big(S_N\big)&=E\big(S_1\,\big|\,N=1\big)P(N=1)+E\big(S_2\,\big|\,N=2\big)P(N=2)\\ &\hspace{2em}+E\big(S_3\,\big|\,N=3\big)P(N=3)\\ &=0+(1-p)p+p(1+2p)\\ &=2p+p^2\\ &\ne \mu E(N)=p\big(1+3p-p^2\big)\ , \end{align}
since $\ E\big(S_N\big)-\mu E(N)=p(1-p)^2>0\ $ for $\ 0<p<1\ $.
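The example with $\ N=\max(2X_1,X_2)+1\ $ is small enough to check by exact enumeration. The sketch below does this for one arbitrarily chosen value of $p$; the function name `exact_moments` and the choice $p=\frac{1}{3}$ are assumptions made only for illustration. It prints the conditional expectations $E\big(S_n\,\big|\,N=n\big)$, $E\big(S_N\big)$ and $\mu E(N)$ so that they can be compared with the hand calculation.

```python
from fractions import Fraction
from itertools import product

def exact_moments(p):
    """Enumerate (X1, X2, X3) for i.i.d. Bernoulli(p) with N = max(2*X1, X2) + 1.

    N never exceeds 3, so three variables are enough to evaluate S_N.
    Returns (E[S_N], E[N], {n: E[S_n | N = n]}).
    """
    p = Fraction(p)
    e_sn = Fraction(0)
    e_n = Fraction(0)
    by_n = {}                     # n -> [P(N = n), E[S_n * 1{N = n}]]
    for xs in product((0, 1), repeat=3):
        prob = Fraction(1)
        for x in xs:
            prob *= p if x == 1 else 1 - p
        n = max(2 * xs[0], xs[1]) + 1
        s = sum(xs[:n])           # S_N = X_1 + ... + X_N
        e_sn += prob * s
        e_n += prob * n
        acc = by_n.setdefault(n, [Fraction(0), Fraction(0)])
        acc[0] += prob
        acc[1] += prob * s
    cond_means = {n: total / pr for n, (pr, total) in by_n.items()}
    return e_sn, e_n, cond_means

p = Fraction(1, 3)                # arbitrary illustrative value in (0, 1)
e_sn, e_n, cond_means = exact_moments(p)

for n in sorted(cond_means):
    print(f"E[S_n | N={n}] = {cond_means[n]},  E[S_{n}] = {n * p}")
print("E[S_N]   =", e_sn)
print("mu*E[N]  =", p * e_n)
```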

Hints for part c)

  • Since $\ T\ $ isn't independent of $\ \big\{X_n\big\}\ $, your $4$-step procedure breaks down at step $2$, for the reason explained above. However, $\ S_T\ $ has a fairly simple distribution which enables $\ E\big(S_T\big)\ $ and $\ \text{Var}\big(S_T\big)\ $ to be calculated directly.
  • If you put $\ e_s=E\big(\min\big\{n\ge0\,\big|\,\big|s+S_n\big|=a\big\}\big)\ $, you can derive a second order linear recurrence which $\ e_s\ $ must satisfy, and this can be solved by standard techniques to obtain a formula for $\ e_s\ $ as a function of $\ s\ $, and then $\ E(T)=e_0\ $.
  • If you let \begin{align} \phi_s=P\big(&\min\big\{n\ge0\,\big|\,s+S_n=a\big\}<\\ &\min\big\{n\ge0\,\big|\,s+S_n=-a\big\}\big)\end{align} you should also be able to set up and solve a similar second order linear recurrence to obtain a formula for $\ \phi_s\ $. The quantity $\ \phi_0\ $ is then the probability that $\ S_n\ $ reaches $\ a\ $ before it first reaches $\ -a\ $, and $\ 1-\phi_0\ $ is the probability that it reaches $\ -a\ $ first.
  • To calculate the distribution of $\ S_T\ $, you can use the formula \begin{align} P\big(S_T=x\big)&=\sum_{n=1}^\infty P\big(S_T=x\,\big|\,T=n\big)P(T=n)\\ &=\sum_{n=1}^\infty P\big(S_n=x\,\big|\,T=n\big)P(T=n)\ . \end{align} In this formula, what is the value of $\ P\big(S_n=x\,\big|\,T=n\big)\ $ if $\ x\ne a\ $ and $\ x\ne -a\ $ ? What is its value if $\ x=a\ $? If $\ x=-a\ $?
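
Once you have worked through the hints, a quick simulation is one way to check your formulas for part c). This is only a rough sketch: the function names, the seed, the number of trials and the values of `a` and `p` are arbitrary illustrative choices, and `a` is assumed to be a non-negative integer.

```python
import random

def run_until_exit(a, p, rng):
    """Walk with steps +1 (prob. p) and -1 (prob. 1 - p) until |S_n| == a.

    Returns (T, S_T). Assumes a is a non-negative integer; for any
    other a the loop would never terminate.
    """
    s, t = 0, 0
    while abs(s) != a:
        s += 1 if rng.random() < p else -1
        t += 1
    return t, s

def estimate(a, p, trials=20_000, seed=0):
    """Monte Carlo estimates of E[T], E[S_T] and Var(S_T)."""
    rng = random.Random(seed)
    samples = [run_until_exit(a, p, rng) for _ in range(trials)]
    mean_t = sum(t for t, _ in samples) / trials
    mean_s = sum(s for _, s in samples) / trials
    mean_s2 = sum(s * s for _, s in samples) / trials
    return mean_t, mean_s, mean_s2 - mean_s ** 2, samples

a, p = 3, 0.6                     # illustrative values only
mu = 2 * p - 1                    # mean of a single +/-1 step
mean_t, mean_s, var_s, samples = estimate(a, p)

print("estimate of E[T]      :", mean_t)
print("estimate of E[S_T]    :", mean_s)
print("mu * estimate of E[T] :", mu * mean_t)
print("estimate of Var(S_T)  :", var_s)
```

The estimates are subject to sampling error, so they should only agree with exact formulas approximately.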