I'm working on a project comparing the accept-reject (von Neumann) and Metropolis methods of sampling. I'm generating a sample of size $N$ from a normal distribution $N(1,(\frac{1}{2})^2)$. What I found was that for the Accept-Reject method, the algorithm is much slower, but gives very good results even for relatively small $N$. On the other hand, the Metropolis algorithm I'm using works much quicker, but even for very big $N$ (for example $10^5$), it gives a worse result.
Are my results correct? Is this the main difference between the two algorithms?
This is the MATLAB code I'm using for the Metropolis algorithm:
function [X, total] = metropolis(f,N,b) %METROPOLIS Generate pseudorandom numbers using Metropolis algorithm. % f - pdf to be simulated % M - upper bound on pdf % N - size of generated sample % b - interval from which to sample
dx = diff(b)/100;
total = 0;
n = 1;
X = zeros(N,1);
X(1)=0.5*(max(b)+min(b));
while n < N
total = total + 1;
xt = X(n)-dx + rand(1)*2*dx;
if f(xt) >= f(X(n))
X(n+1) = xt;
n = n+1;
else
r = rand(1);
if r < f(xt)/f(X(n))
X(n+1) = xt;
n = n+1;
else
X(n+1) = X(n);
n = n+1;
end
end
end
The performance of the Metropolis method depends heavily on the class of trial Metropolis steps considered. (In this 1-D case, on the step size among which you choose your trial step.) If the step size is very large, the Metropolis method behaves like the accept-reject method.
If the step size is too small, then you run into the problem of long correlation lengths. That is, although you are getting (say) $10^5^ samples, if the correlation length is 10,000 steps, your effective sample size is only 10 points. Then your statistical sample will likely "give a bad result."
The Metropolis method shines in situations where the space of possible points is high-dimensioned and you can't analytically place a tight limit on your sample space, so that the accept/reject method very rarely gives an accept. Here, the accept /reject method might take way too long to provide a statistically significant set of sample points.
In that case, if there is a natural and computationally efficient way to calculate the probability difference for a step which is large in a local sense, yet small in the overall probability sense, then Metropolis is very useful. For example, in lattice gauge theory, the change in action associated with a change in a single link on the lattice is small enough that the acceptance rate is fairly high. And even though the correlation "time" may be thousands of steps, this is a much more efficient way to generate an ensemble of (farily) decorrelated samples than accept/reject would be.