Is it possible to express the posterior of the function of a parameter in terms of the posterior of the parameter?


Given a parameter $\theta$ with posterior $p(\theta\mid D)$, where $D$ is the data, and any function $f(\theta)$, we may write $$E(f(\theta)\mid D)=\int_\theta f(\theta)\, p(\theta\mid D)\,d\theta.$$

By Monte Carlo simulation we can draw $\theta$ from the posterior and apply the transformation $f$ to obtain samples from $p(f(\theta)\mid D)$; averaging these samples gives an approximation of $E(f(\theta)\mid D)$.
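The procedure above can be sketched in a few lines; here is a minimal Python version, where the choice of a $\mathsf{Beta}(3,5)$ posterior and $f(\theta)=\theta^2$ is purely an illustrative assumption:

```python
import numpy as np

# Sketch of the sample-then-transform procedure described above.
# Assumptions (not from the original post): the posterior p(theta | D)
# is Beta(3, 5) as a stand-in, and f(theta) = theta^2.
rng = np.random.default_rng(0)
m = 10**6

theta = rng.beta(3, 5, size=m)   # draws from p(theta | D)
f_theta = theta**2               # transformed draws, i.e. samples of f(theta)

mc_estimate = f_theta.mean()     # approximates E(f(theta) | D)

# For Beta(a, b), E(theta^2) = a*(a+1) / ((a+b)*(a+b+1)),
# so here the exact value is 3*4 / (8*9) = 1/6.
print(mc_estimate)
```

The Monte Carlo estimate should land close to the exact value $1/6$.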

My question is: is it possible to write $p(f(\theta)\mid D)$ in terms of the posterior $p(\theta\mid D)$, similar to the way I related the expectation of $f(\theta)$ to the posterior of $\theta$ above? Put differently, how can I prove that the Monte Carlo generation of samples from $p(f(\theta)\mid D)$ is valid?


Best answer:

Your equation $E(f(\theta)\mid D)=\int_\theta f(\theta)\, p(\theta\mid D)\,d\theta$ follows from the usual rule for the expectation of a function of a random variable. Also, the Law of Large Numbers says that the sample mean of $n$ independent draws from the distribution of a random variable converges to the expectation of that random variable as $n$ grows. So, provided you trust your pseudorandom generator, I don't see anything to prove.

If I am missing something, maybe you can explain whether and where the following simple example breaks down. Suppose $\theta$ is the success probability of a device, the prior distribution is $\mathsf{Beta}(2,2),$ and data show $x = 10$ successes in $n = 25$ trials. Then the posterior is

$$p(\theta|x) \propto p(\theta)\,p(x|\theta) \propto \theta^{2-1}(1-\theta)^{2-1} \times \theta^{10}(1-\theta)^{15} \propto \theta^{12-1}(1-\theta)^{17-1},$$ which is the kernel of $\mathsf{Beta}(12,17),$ so $E(\theta|x) = 12/(12+17) = 0.4137931.$

If $f(\theta) = 2\theta + 5$, then $E(f(\theta)|x) = 2E(\theta|x) + 5 = 5.827586.$
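These numbers can be checked exactly with rational arithmetic; the following short Python sketch recomputes them from the posterior parameters:

```python
from fractions import Fraction

# Exact check of the posterior mean above: prior Beta(2,2) with
# 10 successes in 25 trials gives posterior Beta(2+10, 2+15) = Beta(12, 17).
a, b = Fraction(12), Fraction(17)

post_mean = a / (a + b)      # E(theta | x) = 12/29
f_mean = 2 * post_mean + 5   # E(2*theta + 5 | x)

print(float(post_mean))      # 0.41379310...
print(float(f_mean))         # 5.82758620...
```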

A simple simulation in R is as follows:

m = 10^7;  alp = 12;  bta = 17      # posterior Beta(12, 17)
x = rbeta(m, alp, bta)              # draws from the posterior
mean(x);  mean(2*x + 5)
## 0.4137326      # aprx E(theta|x)
## 5.827465       # aprx E(f(theta)|x)

Addendum replying to Comment:

In general, if $Y$ has PDF $f_Y(y)$ and $h(y)$ is a strictly increasing or strictly decreasing function on the support of $Y$, then $U = h(Y)$ has PDF $f_U(u) = f_Y(h^{-1}(u))\,|J|,$ where $J = d\,h^{-1}(u)/du.$ For example, if $Y \sim \mathsf{UNIF}(0,1),$ then $U = \sqrt{Y} \sim \mathsf{BETA}(2,1).$ This is called something like the 'PDF transformation method' in undergraduate probability texts. [This also works for a conditional PDF.]

If you simulate realizations of $Y \sim \mathsf{UNIF}(0,1)$ and find $U=\sqrt{Y},$ then the $U$'s will be realizations of $\mathsf{BETA}(2,1).$ With simulation, the transformation doesn't need to be monotone.
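This simulation claim is easy to check numerically. A minimal Python sketch: draw from $\mathsf{UNIF}(0,1)$, take square roots, and compare the sample moments with the exact $\mathsf{BETA}(2,1)$ moments $E(U)=2/3$ and $E(U^2)=1/2$:

```python
import numpy as np

# If Y ~ Unif(0,1), then U = sqrt(Y) should follow Beta(2,1),
# whose PDF is 2u on (0,1).
rng = np.random.default_rng(0)
y = rng.uniform(0.0, 1.0, size=10**6)
u = np.sqrt(y)

# Beta(2,1) has E(U) = 2/3 and E(U^2) = E(Y) = 1/2.
print(u.mean())       # close to 2/3
print((u**2).mean())  # close to 1/2
```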

But that's analogous to what I did for $f(\theta) = 2\theta+5$ above. [I used a linear transformation so the expectation would be obvious.] So I'm still wondering if I understand what you want.