Knowing $f$, choose a suitable function $g$ so that $\operatorname{supp}(f) \subseteq \operatorname{supp}(g)$.

54 Views Asked by At

For one to use the rejection sampling algorithm, it requires that $\sup_x f(x)/g(x)\le M < \infty$. Of course, this means we are looking for a target density $f$ with fatter tails than the proposed density to sample from $g$, and that $\operatorname{supp}(f) \subseteq \operatorname{supp}(g)$. But is there a way to look at the density of $f$ and choose a suitable proposal?

Say we wanted to sample from a random variable $X$, with density $$ f_X(x) = \frac{2x}{(x^2+1)^2}\mathbb{1}_{(0,\infty)}(x)$$ My course notes then proceed to sample from the standard normal density (truncated to the positive line): $$ g_X(x) = \sqrt{\frac{2}{\pi}}\exp(-x^2/2)\mathbb{1}_{(0,\infty)}(x)$$ There was no explanation from how to choose $g$ as the proposal, and although you could make an educated guess of the shape of $f$ being exponential, and that the characteristic function must be included so that the support of $f$ is in $g$, I am unsure of the origins of the rest of it's structure. Say, if we were to have a much more complex function to sample from and we couldn't plot the shape, how would we decide?

1

There are 1 best solutions below

0
On BEST ANSWER

This is really an art, not a science, but there are some basic principles. In this example the normal distribution is actually a terrible idea because the tail is much less fat than the tail of the desired distribution, so it will take a long time before a "tail sample" is generated in the first place, at which point it will be pretty much guaranteed to be accepted. You could have seen this by noticing that your condition about the supremum is violated as $x \to \infty$.

In a case like this with power law scaling (so the tail is relatively fat), you really just want to get the tail right and sort everything else out through rejection. That means you want a PDF proportional to $x^{-3}$ on $(a,\infty)$ for some $a>0$ and do something else near zero. If you insist on this supremum requirement (which is rather restrictive, and in general not necessary), then you should probably choose the PDF proportional to $x$.

Overall this distribution has two free parameters for you to play with (the two coefficients and $a$, minus one for the normalization requirement). Playing with such "parametric" examples numerically can really help with your intuition.

As you say, you also need to keep track of the supports. In measure-theoretic language you need to ensure that the target distribution is absolutely continuous with respect to the sampling distribution. Otherwise you will never see part of the support of the target distribution (no amount of rejection can fix that). It's technically OK if the sampling distribution is not absolutely continuous with respect to the target distribution. But in that case some of your samples will just be a waste of time because they will be rejected with probability 1.

The other subtle thing is that you are actually rather limited in the things you can analytically sample from easily. With a few exceptions (like the normal distribution), you need a way to quickly evaluate the quantile function of your proposal distribution. If you don't have that, then you would be better off just directly sampling from your original thing (by numerically computing the quantile function of your original thing).