Convergence not improving in a simple Monte Carlo simulation


I'm writing a complex Monte Carlo code. While debugging, I noticed unexpected behavior in the output of my random number generator, and I'm confused about why it's happening. To illustrate, I've written a simplified example.

In the simplified code, I generate $1000$ random numbers in $(0,1)$ and count how many fall below $0.5$. I repeat this process $50000$ times, taking the running mean of the counts after each iteration to plot the convergence. The following is the code:

rng('shuffle');                    % seed the generator from the clock

MxI = 50000;                       % number of iterations
MxL = 1000;                        % random numbers per iteration

ResultArray1 = zeros(MxI, 1);      % count below 0.5 in each iteration
ResultArray2 = zeros(MxI, 1);      % running mean of those counts

for cnt = 1:MxI

    xPos = rand(MxL, 1);           % uniform samples in (0,1)
    BeLow = 0;
    for i = 1:MxL
        if xPos(i) < 0.5
            BeLow = BeLow + 1;
        end
    end

    ResultArray1(cnt) = BeLow;
    ResultArray2(cnt) = mean(ResultArray1(1:cnt));

end

figure;
plot(ResultArray2);

The resulting plot shows the running mean of the count below $0.5$ after each iteration. Now, if I run the code twice and superimpose the resulting plots, I expect the difference between the two curves to gradually decrease. However, that does not happen. On rare occasions it converges; at other times it remains at an approximately fixed distance, and sometimes it even diverges slightly. See the following figure for the results from the last $15000$ steps:

[Figure: the last 15000 steps]

Now, I admit the values are very close to $500$ (the exact answer). However, shouldn't the two lines converge even more as I increase the number of iterations? Or am I expecting too much of a pseudo-random number generator?
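For scale, here is a quick sketch (illustrative, not from the original post; all names are mine) of what the central limit theorem predicts for the gap between two runs, sampling the per-iteration counts directly as binomials:

```python
import numpy as np

# Illustrative sketch: the count below 0.5 in one iteration is
# Binomial(1000, 0.5), with standard deviation sqrt(1000 * 0.25) ~ 15.8.
# The running mean after n iterations therefore fluctuates on the scale
# sqrt(250 / n), and two independent runs typically sit about
# sqrt(2 * 250 / n) apart -- a gap that shrinks only like 1 / sqrt(n).
rng = np.random.default_rng(0)

n = 50000
counts_a = rng.binomial(1000, 0.5, n)  # run A: count below 0.5 per iteration
counts_b = rng.binomial(1000, 0.5, n)  # run B: an independent repeat

steps = np.arange(1, n + 1)
mean_a = counts_a.cumsum() / steps     # running mean of run A
mean_b = counts_b.cumsum() / steps     # running mean of run B

gap = abs(mean_a[-1] - mean_b[-1])     # final gap between the two runs
predicted_scale = np.sqrt(2 * 250 / n) # CLT scale: 0.1 at n = 50000
print(gap, predicted_scale)
```

At $n = 50000$ the typical gap is only around $0.1$, so two runs hovering a roughly fixed, small distance apart is exactly what the theory predicts.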

BEST ANSWER

It converges, but slowly. Here is my Python code comparing iterations $[10, 14]$, $[1000, 1004]$, and $[100000, 100004]$.

In each window the two curves are closer together, and they change less and less within those 5 iterations. You should expect their distance to shrink by about a factor of $10$ after $100\times$ more iterations, since the convergence is $1/\sqrt{n}$.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()

MxI = 100005   # number of iterations
MxL = 1000     # samples per iteration

ResultArray1 = np.zeros(MxI)   # fraction below 0.5 in each iteration
s = []

for run in range(2):           # two independent runs
    for cnt in range(MxI):
        xPos = rng.uniform(0, 1, MxL)
        ResultArray1[cnt] = (xPos < 0.5).sum() / MxL

    # running mean of the counts, shifted so the exact answer 500 sits at 0
    s.append(1000 * ResultArray1.cumsum() / np.arange(1, MxI + 1) - 500)

plt.plot(s[0][10:15])
plt.plot(s[1][10:15])
plt.show()

plt.plot(s[0][1000:1005])
plt.plot(s[1][1000:1005])
plt.show()

plt.plot(s[0][100000:100005])
plt.plot(s[1][100000:100005])
plt.show()

In the pictures, the difference is on the order of $1$, then $0.1$, then $0.01$, approximately as it should be:

[Figure: around iteration 10]

[Figure: around iteration 1,000]

[Figure: around iteration 100,000]

As for the math: we are averaging i.i.d. $\textrm{Binomial}(1000, 1/2)$ random variables, so the running average converges to the mean of one of them, which is $1000 \cdot (1/2) = 500$.
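A quick numerical sanity check of those moments (a sketch, using NumPy's binomial sampler in place of the counting loop): the mean should be $500$ and the variance $1000 \cdot \frac{1}{2} \cdot \frac{1}{2} = 250$.

```python
import numpy as np

# Sanity check (illustrative): each iteration's count is Binomial(1000, 1/2),
# with mean 1000 * 0.5 = 500 and variance 1000 * 0.5 * 0.5 = 250.
rng = np.random.default_rng(1)

counts = rng.binomial(1000, 0.5, 200000)
print(counts.mean())  # close to 500
print(counts.var())   # close to 250
```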

The variance of the average is $\textrm{var}\!\left(\frac{X_1 + \cdots + X_n}{n}\right) = \frac{n \cdot \textrm{var}(X_1)}{n^2} = \frac{\textrm{var}(X_1)}{n}$, so the standard deviation decays as $1/\sqrt{n}$. The difference of two independent averages decays at the same rate: it has twice the variance, which still shrinks like $1/n$. Over your last $15000$ steps the improvement is only $1/\sqrt{50000/35000} = \sqrt{0.7} \approx 0.84$, which is not necessarily visible. This factor of $0.84$ holds if you take the average distance after $35000$ and after $50000$ iterations independently and compare.
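That $\approx 0.84$ factor can be checked empirically by averaging the gap over many independent pairs of runs. The sketch below (illustrative, not part of the answer) avoids the inner loop by using the fact that a sum of $n$ counts is itself $\textrm{Binomial}(1000n, 1/2)$:

```python
import numpy as np

# Estimate the typical gap between two independent running means after
# 35000 and after 50000 iterations. The partial sum of n counts is
# Binomial(1000 * n, 1/2), so we sample the sums directly, then extend
# each run by the Binomial(1000 * 15000, 1/2) increment.
rng = np.random.default_rng(3)

pairs = 4000
s35 = rng.binomial(1000 * 35000, 0.5, (pairs, 2)).astype(float)
s50 = s35 + rng.binomial(1000 * 15000, 0.5, (pairs, 2))  # extend to 50000

gap35 = np.abs(s35[:, 0] / 35000 - s35[:, 1] / 35000).mean()
gap50 = np.abs(s50[:, 0] / 50000 - s50[:, 1] / 50000).mean()
ratio = gap50 / gap35  # should be near sqrt(35000 / 50000) ~ 0.84
print(ratio)
```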