Running minimal estimation error for Monte Carlo


Let $ X_n $ be i.i.d. real-valued random variables with zero mean and unit variance. Let $ S_n $ be the sum of the first $n$ variables (a discrete-time random walk in 1D): $$ S_n:=\sum_{m\leq n} X_m. $$ The law of the iterated logarithm says that, almost surely, $$ \limsup_{n\to\infty}\frac{|S_n|}{\sqrt{2n\log\log n}}=1, $$ so $|S_n|$ is eventually bounded by $(1+\epsilon)\sqrt{2n\log\log n}$ for every $\epsilon>0$.

In other words: if we use the Monte Carlo estimator $\overline{X}_n:=S_n /n$ to estimate the mean $E[X_1]=0$, then the error decays almost surely at the rate above divided by $ n $: $$ \limsup_{n\to\infty}\frac{\sqrt{n}}{\sqrt{2\log\log n}}|\overline{X}_n|\leq 1 $$
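This scaling is easy to check numerically on a single sample path (a rough sketch only, not a proof; the standard normal choice for $X_n$ and the seed are my own):

```python
import numpy as np

# Single-path sanity check of the LIL rate: the rescaled error
# sqrt(m / (2 log log m)) * |Xbar_m| should stay of order 1.
rng = np.random.default_rng(0)
n = 500_000
X = rng.standard_normal(n)      # zero mean, unit variance
S = np.cumsum(X)
m = np.arange(1, n + 1)
Xbar = S / m

tail = m >= 100                 # avoid log log m near 0 for small m
scaled = np.sqrt(m[tail] / (2 * np.log(np.log(m[tail])))) * np.abs(Xbar[tail])
print(scaled.max())
```

On typical runs the printed maximum is of order 1, consistent with the limsup bound above.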

My question: how good is the best estimate that Monte Carlo produces up to time $ n $? That is, what can be said about $$ R_n:=\min_{m\leq n} |\overline{X}_m|? $$ As an example of what I am looking for, assume that the variables $X_n$ are bounded in absolute value by $ C $. Then $$ \liminf_{n\to\infty}nR_n\leq C, $$ because each time the random walk $S_n$ crosses the origin, it comes within distance $ C $ of the origin; at such a time $m$ we have $|\overline{X}_m|\leq C/m$, and this happens infinitely often by recurrence.
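This argument can be illustrated numerically (a sketch with my own choices: increments uniform on $[-C,C]$ with $C=\sqrt{3}$ so that the variance is 1, and a fixed seed):

```python
import numpy as np

# Whenever S changes sign between steps m and m+1, the bounded increment
# forces |S_{m+1}| < C, hence m * |Xbar_m| <= C there and R_n <= C / m.
rng = np.random.default_rng(1)
n = 100_000
C = np.sqrt(3.0)                # uniform on [-C, C] has unit variance
X = rng.uniform(-C, C, size=n)
S = np.cumsum(X)
m = np.arange(1, n + 1)
R = np.minimum.accumulate(np.abs(S / m))

hits = int(np.count_nonzero(np.abs(S) <= C))   # times with m * |Xbar_m| <= C
print(hits, R[-1])
```

The many visits of $S$ to the strip $[-C,C]$ are exactly the times at which $nR_n$ dips below $C$.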

I could imagine, but not prove, three obvious improvements:

(1) Does the result hold for unbounded $X_n$?

(2) Can the rate be improved? I would imagine we can get the rate $n^{-2+\epsilon} $ for any $\epsilon>0$. (Sketch: in a time interval of length $n$, the random walk has $\approx c n$ crossings ($c>0$ some constant), and the distance to the origin at each crossing lies somewhere between $0$ and $C$. Therefore, between times $n$ and $2n$ there are $\approx cn$ crossings, and at one of those crossings the distance to the origin should be smaller than $C/(cn)$. Thus $R_n$ reaches $C/(cn^2)$ at this crossing.)

(3) What happens in higher dimensions? Obviously, the crossing approach does not work anymore, but can we still get rates that are better than $\sqrt{2\log\log n}n^{-1/2}$?
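The crossing count that the sketch in (2) relies on can at least be inspected empirically (again my own setup: uniform bounded increments, fixed seed, a single path, so only indicative):

```python
import numpy as np

# Count sign changes of S up to 2n and, separately, in the window [n, 2n]
# that the sketch for (2) uses.
rng = np.random.default_rng(2)
N = 200_000                     # horizon 2n with n = 100_000
C = np.sqrt(3.0)
X = rng.uniform(-C, C, size=N)
S = np.cumsum(X)

signs = np.sign(S)
changes = signs[1:] != signs[:-1]
n = N // 2
in_window = int(np.count_nonzero(changes[n - 1:]))   # approx. crossings in [n, 2n]
total = int(np.count_nonzero(changes))
print(total, in_window)
```

Comparing the printed counts for several values of $n$ shows how the number of crossings in $[n,2n]$ actually grows, which is what decides whether the $\approx cn$ assumption in the sketch is tenable.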

You can see below plots of $\overline{X}_n$ and $R_n$ in one dimension and in two dimensions (replace absolute value by norm in two dimensions).

The plot in one dimension suggests that the best rate is around $n^{-3/2}$ in contradiction to (2). The plot in two dimensions suggests that the best rate is $n^{-1}$. For three dimensions the numerically suggested rate was indistinguishably close to $n^{-1/2}$.
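The empirical exponent behind these plots can be estimated by a log-log fit on the tail of $R_n$ (a rough sketch with my own parameters; running minima are step functions with long flat stretches, so individual paths are very noisy and the averaged slope is only indicative):

```python
import numpy as np

# Estimate the decay exponent of R_n in one dimension: fit log R_m
# against log m on the tail of each path, then average over paths.
rng = np.random.default_rng(3)
n, paths = 100_000, 20
m = np.arange(1, n + 1)
slopes = []
for _ in range(paths):
    X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=n)
    R = np.minimum.accumulate(np.abs(np.cumsum(X) / m))
    s, _ = np.polyfit(np.log(m[n // 10:]), np.log(R[n // 10:]), 1)
    slopes.append(s)
print(np.mean(slopes))
```

A mean slope near $-3/2$ would match the rate suggested by the one-dimensional plot.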

Furthermore, the plots looked similar for unbounded and bounded increments $X_n$, suggesting that (1) is true.

Finally, the plots suggest one more extension of my result:

(4) Is it true that in one dimension we can also bound the supremum: $\limsup_{n\to\infty} n^{s}R_n<\infty$ for some $s>1/2$? Is it true that this cannot be done in higher dimensions?

EDIT: That the rate is at least $n^{-1}$ in two dimensions can be proven similarly to the one-dimensional argument for this rate, using the recurrence of the random walk $S_n$: fix a ball around the origin. Each time the random walk hits this ball (which it does infinitely often), $R_n$ is smaller than $C/n$, with $C$ the radius of the ball. Note also that in three dimensions the random walk is transient, which destroys this argument, but this does not strictly prove that no rate better than $n^{-1/2}$ is possible.
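The two-dimensional version of this argument can also be checked numerically (a sketch with my own choices: Gaussian steps, which are unbounded and so also probe question (1), ball radius $3$, fixed seed):

```python
import numpy as np

# 2D recurrence argument: each visit of S to the ball of radius r around
# the origin at time m forces R_m <= r / m.
rng = np.random.default_rng(4)
n = 200_000
X = rng.standard_normal((n, 2))          # unit-variance coordinates, unbounded
S = np.cumsum(X, axis=0)
m = np.arange(1, n + 1)
norms = np.linalg.norm(S, axis=1)
R = np.minimum.accumulate(norms / m)

r = 3.0                                  # fixed ball around the origin
visits = int(np.count_nonzero(norms <= r))
print(visits, n * R[-1])
```

The same code with `(n, 3)`-shaped steps probes the transient three-dimensional case, where the visit count stops growing and the argument breaks down.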

One dimension:

[Plot: sample paths of $\overline{X}_n$]

[Plot: sample paths of $R_n$]

Two dimensions:

[Plot: sample paths of $\overline{X}_n$]

[Plot: sample paths of $R_n$]