Monte Carlo Sampling with non-uniform distributions?


I'm currently studying Monte Carlo sampling, referencing Veach's "Robust Monte Carlo Methods for Light Transport Simulation".

On page 63, he writes:

The idea of Monte Carlo integration is to evaluate the integral

$I = \int_{\Omega}f(x)d\mu(x) $

using random sampling. In its basic form, this is done by independently sampling $N$ points $X_1, ..., X_N$ according to some convenient density function $p$, and then computing the estimate

$F_N = \frac{1}{N}\sum_{i=1}^{N}\frac{f(X_i)}{p(X_i)}$

I've been playing around with this, and I understand the technique when using uniform sampling on an arbitrary interval $\Omega=[a, b]$, where $p(X_i) = \frac{1}{b-a}$. I've written a small python test program and it seems to work well.
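For reference, the uniform-sampling case can be sketched along the lines of the test program mentioned above (a minimal sketch; the function names are my own):

```python
import random

def mc_uniform(f, a, b, n, seed=0):
    """Monte Carlo estimate of the integral of f over [a, b] using
    uniform sampling, where p(x) = 1/(b - a) for every sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)
        total += f(x) / (1.0 / (b - a))  # f(X_i) / p(X_i)
    return total / n

# Example: estimate of ∫_0^1 x dx, whose true value is 0.5
print(mc_uniform(lambda x: x, 0.0, 1.0, 100_000))
```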

However, I'm confused by his statement regarding "independently sampling $N$ points $X_1, ..., X_N$ according to some convenient density function $p$".

I'm assuming this means I can choose any arbitrary probability density function I want for $p(X_i)$?

As a simple test, I chose the Gaussian distribution $N(0.5, 0.15)$ to get a PDF centered at $0.5$ and roughly fitted to the interval $[0,1]$. To me, this seems "convenient".

I'm trying to apply the formula by drawing samples $X_i$ using this PDF, and for each iteration of the summation I can evaluate the integrand at each sample as $f(X_i)$, and divide by the probability $p(X_i)$ of each chosen sample.

For simplicity, I'm attempting to integrate trivial functions such as $f(x) = 1$, and $f(x) = x$ etc.

However, this does not seem to work at all; I get values significantly different from the true value of the integral (e.g. we expect $\int_{0}^{1}1\,dx = 1$ and $\int_{0}^{1}x\,dx = 0.5$), even with large sample sizes such as $N = 100{,}000$. The values I get are off by roughly $1$, and vary noticeably between runs.
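For concreteness, here is a minimal sketch of the attempt described above, with a hand-rolled Gaussian PDF (all names are my own):

```python
import math
import random

def gaussian_pdf(x, mu=0.5, sigma=0.15):
    """PDF of the normal distribution N(mu, sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mc_gaussian(f, n, mu=0.5, sigma=0.15, seed=0):
    """Estimate using samples X_i ~ N(mu, sigma), dividing each f(X_i)
    by p(X_i).  Note that N(mu, sigma) has support on all of R, so some
    samples fall outside [0, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, sigma)
        total += f(x) / gaussian_pdf(x, mu, sigma)
    return total / n

# Attempted estimate of ∫_0^1 1 dx; the printed value varies
# noticeably between runs rather than settling at 1.
print(mc_gaussian(lambda x: 1.0, 100_000))
```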

I suspect one of two things: either MC integration in this form requires uniform sampling, or I'm misunderstanding something. I'd appreciate any insights you may have!

There is 1 answer below.

For example, to compute $\int_0^2f(x)\,\mathrm dx$ you might compute $\int_0^1f(x)\,\mathrm dx$ with $1000$ samples and separately compute $\int_1^2f(x)\,\mathrm dx$ with $2000$ samples (because the function is "wilder" over that interval and needs more samples for an equally reliable result), then add the two parts. This is equivalent to taking $3000$ samples over the original interval $[0,2]$ but weighting the samples differently, namely assigning twice the weight to the samples in the subinterval with half the probability density (which also means that the weighted number of samples becomes $\sim 4000$). Taking this further, you can use any other nice probability density, with the induced weighting given by the reciprocal of the density.
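This weighted-sampling idea can be sketched in Python; the piecewise-constant density mirrors the $1000$-vs-$2000$ split above ($p(x)=\tfrac13$ on $[0,1]$ and $p(x)=\tfrac23$ on $[1,2]$), and the integrand is my own choice for illustration:

```python
import random

def p(x):
    """Piecewise-constant density on [0, 2]: twice as many samples
    land in [1, 2] as in [0, 1]."""
    return 1.0 / 3.0 if x < 1.0 else 2.0 / 3.0

def sample_p(rng):
    """Draw from p by inverting its CDF:
    F(x) = x/3 on [0,1] and F(x) = 1/3 + 2(x-1)/3 on [1,2]."""
    u = rng.random()
    if u < 1.0 / 3.0:
        return 3.0 * u
    return 1.0 + 1.5 * (u - 1.0 / 3.0)

def mc_weighted(f, n, seed=0):
    """Estimate ∫_0^2 f(x) dx by averaging f(X_i)/p(X_i);
    samples in [0,1] carry twice the weight of samples in [1,2]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sample_p(rng)
        total += f(x) / p(x)
    return total / n

# Example: ∫_0^2 x^2 dx = 8/3 ≈ 2.6667
print(mc_weighted(lambda x: x * x, 100_000))
```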