I'm writing a complex Monte Carlo code. While debugging, I noticed unexpected behavior in my random number generator, and I'm confused about why it's happening. To illustrate, I've written a simple code.
In the simplified code, I generate $1000$ random numbers in $(0,1)$ and count how many fall below $0.5$. I repeat this process $50000$ times, and after each iteration I take the running mean of the counts to plot the convergence. The following is the code:
rng('shuffle');                 % seed the generator from the current time
MxI = 50000;                    % number of iterations
MxL = 1000;                     % samples per iteration
ResultArray1 = zeros(MxI, 1);   % count of samples below 0.5 per iteration
ResultArray2 = zeros(MxI, 1);   % running mean of those counts
for cnt = 1:MxI
    xPos = rand(MxL, 1);
    BeLow = 0;
    for i = 1:MxL
        if xPos(i) < 0.5
            BeLow = BeLow + 1;
        end
    end
    ResultArray1(cnt) = BeLow;
    ResultArray2(cnt) = mean(ResultArray1(1:cnt));
end
figure;
plot(ResultArray2);
The resulting plot shows the running mean of the number of values below 0.5 after each iteration. Now, if I run the code twice and superimpose the resulting plots, I expect the difference between the two curves to gradually decrease. However, that does not happen: on rare occasions they converge, while at other times they remain at an approximately fixed distance, and sometimes they even diverge slightly. See the following figure for the results from the last $15000$ steps:

Now, I admit the values are very close to $500$ (the exact expected value). However, shouldn't the two lines converge even more as I increase the number of iterations? Or am I expecting too much from the pseudo-random number generator?
It converges, but slowly. Here is my Python code comparing iterations [10, 14], [1000, 1004], and [100000, 100004].
The curves get closer and closer, and each line changes less and less over a window of 5 iterations. You'd expect their distance to shrink by about a factor of 10 after 100× more iterations, since the convergence is $1/\sqrt{n}$.
In the pictures, the difference is on the order of 1, then 0.1, then 0.01, roughly as it should be:
(Plots after 10, 1,000, and 100,000 iterations.)
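Since the original script isn't shown above, here is a minimal sketch of the comparison (the function name `running_mean_counts` and the seeds are my own choices): it runs the experiment twice and prints the gap between the two running means at increasing iteration counts.

```python
import numpy as np

def running_mean_counts(n_iter, n_samples=1000, seed=None):
    """Running mean of the per-iteration count of uniforms below 0.5.

    The count in each iteration is Binomial(n_samples, 0.5), so we draw it
    directly instead of generating all n_samples uniforms.
    """
    rng = np.random.default_rng(seed)
    counts = rng.binomial(n_samples, 0.5, size=n_iter)
    return np.cumsum(counts) / np.arange(1, n_iter + 1)

# Gap between two independent runs after n iterations: it shrinks roughly
# like 1/sqrt(n), i.e. about 10x smaller for every 100x more iterations.
for n in (10, 1000, 100000):
    gap = abs(running_mean_counts(n, seed=1)[-1]
              - running_mean_counts(n, seed=2)[-1])
    print(f"{n:>6} iterations: |mean1 - mean2| = {gap:.3f}")
```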
As for the math, we are averaging iid $\mathrm{Binomial}(1000, 1/2)$ random variables, so the running mean converges to the expectation of one of them, which is $1000 \cdot (1/2) = 500$.
The variance of the average is $\mathrm{var}\!\left(\frac{X_1 + \cdots + X_n}{n}\right) = \frac{n \cdot \mathrm{var}(X_1)}{n^2} = \frac{\mathrm{var}(X_1)}{n}$, so the standard deviation decays as $1/\sqrt{n}$. The difference of two independent running means behaves the same way: it has twice the variance, so its standard deviation also decays as $1/\sqrt{n}$. Going from 35000 to 50000 iterations (your last 15000 steps) therefore only shrinks the typical distance by a factor of $1/\sqrt{50000/35000} \approx 0.84$, which is not necessarily visible. This factor applies if you compare the average distance at 35000 and at 50000 iterations over independent runs.
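As a numerical sanity check (not part of the original answer; the repetition count `n_rep` is my own choice), we can estimate the spread of the running mean empirically and compare it to the theoretical $\sqrt{\mathrm{var}(X_1)/n} = \sqrt{250/n}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n_rep = 2000  # independent repetitions used to estimate the spread

for n in (100, 400, 1600):
    # Mean of n iid Binomial(1000, 0.5) draws, repeated n_rep times.
    means = rng.binomial(1000, 0.5, size=(n_rep, n)).mean(axis=1)
    # Theory: std of the mean = sqrt(var(X1)/n) = sqrt(250/n).
    print(f"n={n:>4}: empirical std={means.std():.3f}, "
          f"theory={np.sqrt(250 / n):.3f}")
```

Quadrupling $n$ at each step should halve the standard deviation, which is exactly the $1/\sqrt{n}$ decay described above.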