How to know the probability distribution without knowing the probability density?


I have an example in my textbook with the following:

Let $\pi$ be the target distribution with a strictly positive continuous density $g$ on $A$, that is

$\pi(A)=\int\limits_{A}g(x)dx$

...then later on the book explains that we are trying to approximate the Boltzmann distribution (which would be $\pi$ in this case) using hit-and-run, an algorithm that requires knowing the pdf $g(x)$.

I'm wondering how it is possible to know $g(x)$ without knowing $\pi(A)$.

The best answer I can come up with is that it is possible to have a constant added to $g(x)$, like in differential equations, which might prevent us from knowing its exact form. But is that the only reason?

EDIT: I'll admit this question is vague, but maybe what I'm trying to get at is: how do we approximate a distribution without sampling from it directly? (e.g. for distributions that are difficult to sample from, such as the Boltzmann distribution)
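One standard resolution (not spelled out in the textbook excerpt above, so take this as an illustrative aside): samplers such as Metropolis-Hastings, a close relative of hit-and-run, only ever use the *ratio* $g(x')/g(x)$, so any unknown normalizing constant cancels. A minimal sketch, assuming a Gaussian random-walk proposal and the toy energy $E(x) = x^2/2$ (both my choices, not from the book):

```python
import math
import random

random.seed(0)

def metropolis_sample(log_g, x0, steps, step_size=1.0):
    """Random-walk Metropolis sampler.

    Needs log_g only up to an additive constant (i.e. the density g
    only up to a multiplicative constant): the normalizing constant
    cancels in the acceptance ratio g(proposal)/g(x).
    """
    x = x0
    samples = []
    for _ in range(steps):
        proposal = x + random.gauss(0.0, step_size)
        # Accept with probability min(1, g(proposal)/g(x)).
        if math.log(random.random()) < log_g(proposal) - log_g(x):
            x = proposal
        samples.append(x)
    return samples

# Unnormalized Boltzmann density exp(-E(x)) with E(x) = x^2/2: the
# normalized target is the standard normal, but the sampler never
# needs the normalizing constant sqrt(2*pi).
samples = metropolis_sample(lambda x: -x * x / 2.0, x0=0.0, steps=50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With enough steps, `mean` and `var` approach 0 and 1, the moments of the normalized target, even though only the unnormalized $e^{-E(x)}$ was supplied.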

EDIT AGAIN AFTER SETTING BOUNTY: There is a common example for this type of sampling which I will use here to illustrate my confusion. Say there is a circle inscribed in a square, and we wish to know the probability that a randomly thrown pebble lands inside the circle (obviously the answer is $\frac{\pi}{4}$). I understand that we can use a sampling method like MCMC (Markov Chain Monte Carlo) to do this. But in this case the sampling tells us the outcome, with a 1 for landing inside the circle and a 0 for landing outside it. For a continuous distribution this is analogous to knowing a priori the pdf on the 2-d surface. In which case, don't we already know the pdf of the distribution we are trying to approximate?
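For concreteness, the pebble experiment can be sketched as plain Monte Carlo (no Markov chain needed for this simple case; the quarter-circle version of the setup is my simplification):

```python
import random

random.seed(0)

# Throw pebbles uniformly on the unit square and count how many land
# inside the quarter circle x^2 + y^2 <= 1.  Note what is actually
# required: (a) the ability to sample the *throwing* distribution
# (uniform on the square) and (b) the 0/1 indicator of the circle --
# not the pdf of the quantity being estimated.
n = 100_000
hits = sum(
    1 for _ in range(n)
    if random.random() ** 2 + random.random() ** 2 <= 1.0
)
estimate = hits / n  # should be close to pi/4
```

The point of the example: what is known here is an easy-to-sample distribution and an indicator function, which is weaker than knowing the pdf of whatever distribution the answers of interest come from.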

On BEST ANSWER

Instead of writing $\pi(A)=\int_A g(x)dx$, which presupposes that $g(x)$ exists, assume for a moment that you just have a function $\pi:\mathcal{A}\rightarrow [0,1]$, with $\mathcal{A}$ the collection of all subsets of the support.

Working in the univariate domain, $\pi(A)$ is the mass assigned to $A$, and could be represented using the CDF: $G(x) = \int_{-\infty}^x g(t)\,dt$. In this situation $g(x) = \frac{d}{dx} G(x)$.

The important thing here is to note that $\pi$ can be well defined, and $G(x)$ can be well defined, without $g(x)$ being defined at all. $\pi(A)$ is just a function mapping from $\mathcal{A}$ to the set $[0,1]$, and $G(x)$ is just equal to $\pi((-\infty,x])$.

So because the derivative does not always exist, $G(x)$ and $\pi(A)$ can exist without $g(x)$ being well defined.
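This can be made concrete with a distribution that has an atom. A small illustrative sketch (my own example, not from the answer): a mixture of a point mass at $0$ and a uniform distribution on $(0,1)$ has a perfectly well-defined $\pi$ and $G$, but no density $g$.

```python
def G(x):
    """CDF of the mixture: a point mass of weight 1/2 at 0, plus
    1/2 * Uniform(0, 1).  G (and hence pi) is well defined, but no
    density g exists: G jumps by 1/2 at x = 0, so it is not
    differentiable there, and the atom cannot be written as the
    integral of any function g."""
    if x < 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 0.5 + 0.5 * x

def pi_interval(a, b):
    """pi((a, b]) computed directly from the CDF -- no pdf needed."""
    return G(b) - G(a)
```

Here `pi_interval(-1.0, 0.0)` returns the mass $1/2$ of the atom at $0$, a quantity that no pointwise density could ever produce by integration.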