Expected values in a sequence

We draw 2 numbers from a normal (Gaussian) distribution with mean $\mu$ and variance $\sigma^2$ and add them to obtain the first value $a_1$ of a sequence. The second value $a_2$ of this sequence is the product of $a_1$ and the sum of 2 other numbers randomly drawn from the same probability distribution. The third value $a_3$ is the product of $a_2$ and the sum of 2 further randomly drawn numbers, and so forth.

Therefore: $a_1=X_1+X_2$ and $a_2=a_1\cdot (X_3+X_4)=(X_1+X_2)(X_3+X_4)$

What is the expected value of $a_n$?

UPDATE

Here is a plot of a very simple simulation where $a_n$ equals $a_{n-1}$ multiplied by a number drawn from a Gaussian distribution with parameters $\mu=2$ and $\sigma=0$ for the red line and $\sigma=0.5$ for the blue line. When I run this simulation 100 times, I get this kind of pattern in about 95% of the cases. It therefore seems that the variance matters when predicting a value in the sequence. Is this correct?

[Plot: blue = high variance | red = low variance | y-axis is logarithmic]

There are 2 best solutions below

BEST ANSWER

As already mentioned, $E[a_n]=(2\mu)^n$ for every $n$, whatever the variance is. An explanation for the deviation observed in the simulations might be the following (although I fail to see how the graph in the post could represent any sequence $(a_n)$ generated as the OP explains, since almost surely $a_n\lt0$ for infinitely many $n$... unless one plots $|a_n|$ and not $a_n$?).

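Assume that $(x_n)$ is i.i.d. normal with mean $1$ and variance $v$, and consider $y_n=x_1x_2\cdots x_n$. Then $y_n$ is distributed like $a_n/(2\mu)^n$ for a suitable variance $v$ (namely $v=\sigma^2/(2\mu^2)$ in the two-draw setting of the question, and $y_n$ is distributed like $a_n/\mu^n$ with $v=\sigma^2/\mu^2$ in the single-draw simulation), and indeed $E[y_n]=1$. But the almost sure behaviour of $y_n$ (the one simulations would exhibit) is quite different.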

To wit, let $f_v$ denote the normal density with mean $1$ and variance $v$. Among $x_1,\ldots,x_n$, approximately $nf_v(x)\,\mathrm dx$ values fall in the infinitesimal interval $(x,x+\mathrm dx)$, hence $$ \sum_{k=1}^n\log|x_k|\approx\int\log|x|\,nf_v(x)\mathrm dx. $$ This heuristic can be made rigorous (it is the law of large numbers applied to $\log|x_k|$), which shows that, when $n\to\infty$, $|y_n|=\mathrm e^{nI(v)+o(n)}$ almost surely, where $$ I(v)=\int_\mathbb R\log|x|\,f_v(x)\mathrm dx=\frac1{\sqrt{2\pi}}\int_\mathbb R\log|1+x\sqrt{v}|\,\mathrm e^{-x^2/2}\mathrm dx. $$ Thus, the ratio between the blue curve and the red curve should behave like $\mathrm e^{nI(v)}$. Qualitatively, this explains the straight lines in the simulations, since the $y$-axis is logarithmic.
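The heuristic above can be checked with a short simulation. The sketch below is not the OP's code: it assumes i.i.d. draws $x_k\sim N(1,v)$ generated with Python's standard `random.gauss`, and the function name `empirical_log_rate`, the seed and the sample size are illustrative choices. It returns $\frac1n\sum_{k\le n}\log|x_k|$, which should be close to $I(v)$ when $n$ is large.

```python
import math
import random


def empirical_log_rate(v, n, seed=0):
    """Return (1/n) * sum_k log|x_k| for n i.i.d. N(1, v) draws.

    Hypothetical helper for illustration; v is a variance, not a
    standard deviation.
    """
    rng = random.Random(seed)
    sd = math.sqrt(v)  # random.gauss expects a standard deviation
    total = 0.0
    for _ in range(n):
        x = rng.gauss(1.0, sd)
        total += math.log(abs(x))  # x == 0.0 has probability zero
    return total / n


if __name__ == "__main__":
    for v in (1 / 16, 1 / 8):
        print(v, empirical_log_rate(v, n=200_000))
```

A negative return value means that $|y_n|$ typically decays like $\mathrm e^{nI(v)}$ even though $E[y_n]=1$ for every $n$.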

Quantitatively, the parameters used in the simulation (mean $2$, variance $1/2$) correspond to $v=(1/2)/2^2=1/8$, and $I(1/8)\approx-0.0825$, which is negative; hence the ratio blue/red indeed goes to zero, almost surely. For $n=800$, this indicates a ratio red/blue of order $10^{29}$, which is much greater than what the simulations indicate.

My guess is that the parameters are actually mean $2$ and standard deviation $1/2$ (not variance $1/2$). This guess yields $v=(1/2)^2/2^2=1/16$, $I(1/16)\approx-0.0352$ and, for $n=800$, a ratio red/blue of order $10^{12}$, which seems compatible with the simulations.
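For readers who want to reproduce these numbers, here is a minimal sketch that evaluates $I(v)$ by a plain midpoint rule on the second integral formula. The names `growth_rate` and `log10_ratio`, the truncation at $|x|\le12$ and the number of cells are my own assumptions; the logarithmic singularity at $x=-1/\sqrt v$ is integrable, so a fine grid that avoids hitting it exactly is enough for a few digits.

```python
import math


def growth_rate(v, half_width=12.0, steps=200_000):
    """Midpoint-rule estimate of I(v) = E[log|1 + sqrt(v) * Z|], Z ~ N(0, 1).

    Hypothetical helper; the integral is truncated to [-half_width, half_width].
    """
    s = math.sqrt(v)
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h  # midpoint of the k-th cell
        u = abs(1.0 + s * x)
        if u > 0.0:  # skip the log singularity if a midpoint lands on it
            total += math.log(u) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)


def log10_ratio(v, n):
    """log10 of exp(-n * I(v)), the predicted red/blue ratio after n steps."""
    return -n * growth_rate(v) / math.log(10.0)


if __name__ == "__main__":
    for v in (1 / 8, 1 / 16):
        print(v, growth_rate(v), log10_ratio(v, 800))
```

Calling `log10_ratio(v, 800)` gives the predicted number of decades separating the two curves at $n=800$ for a given $v$.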

Let $\{X_i\}_{i=1}^{\infty}$ be i.i.d. $N(\mu,\sigma^2)$ random variables (representing the random numbers drawn from your distribution; here $\mu$ is the mean and $\sigma^2$ is the variance). Then we have $$ a_1 = X_1 + X_2,\quad a_2 = (X_1 + X_2)(X_3+X_4),\quad \ldots,\quad a_n = (X_1+X_2)(X_3+X_4) \cdots (X_{2n-1}+X_{2n}).$$ Now, using independence, $$ E[a_n] = \underbrace{E[X_1+X_2]\,E[X_3+X_4]\cdots E[X_{2n-1}+X_{2n}]}_{n \text{ factors}} = \big(E[X_1]+E[X_2]\big)^n=2^n\mu^{n}. $$ Note that by assumption $E[X_i]=\mu$ for each $i$, so $$E[X_i + X_{i+1}]=E[X_{i}]+E[X_{i+1}]=2\mu,$$ which is where the $2\mu$ comes from.
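As a sanity check of $E[a_n]=(2\mu)^n$, one can average many independent copies of $a_n$. The sketch below uses only Python's standard `random` module; the helper names `sample_a_n` and `mean_a_n`, the seed and the number of trials are assumptions made for illustration, not something taken from the question.

```python
import random


def sample_a_n(n, mu, sigma, rng):
    """One draw of a_n = (X_1 + X_2)(X_3 + X_4)...(X_{2n-1} + X_{2n})."""
    a = 1.0
    for _ in range(n):
        # each factor is the sum of two fresh N(mu, sigma^2) draws
        a *= rng.gauss(mu, sigma) + rng.gauss(mu, sigma)
    return a


def mean_a_n(n, mu, sigma, trials=200_000, seed=0):
    """Monte Carlo estimate of E[a_n] (sigma is a standard deviation)."""
    rng = random.Random(seed)
    return sum(sample_a_n(n, mu, sigma, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    n, mu, sigma = 3, 1.0, 0.5
    print(mean_a_n(n, mu, sigma), (2 * mu) ** n)
```

This works well only for small $n$: since $E[a_n^2]=(4\mu^2+2\sigma^2)^n$, the variance of $a_n$ grows geometrically and the sample mean becomes unreliable for long sequences.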