Bayes' theorem with multiple variables


The Wikipedia page https://en.wikipedia.org/wiki/Bayesian_inference#Formal_description_of_Bayesian_inference gives the result:

$$p(\theta \mid \mathbf{X},\alpha) = \frac{p(\mathbf{X} \mid \theta) p(\theta \mid \alpha)}{p(\mathbf{X} \mid \alpha)} $$

which I am having trouble deriving. Here's my attempt. We have:

$$p(\theta, \mathbf{X},\alpha) = p(\theta \mid \mathbf{X},\alpha)p(\mathbf{X}, \alpha) = p(\theta \mid \mathbf{X},\alpha)p(\mathbf{X} \mid \alpha)p(\alpha)$$

Also:

$$p(\theta, \mathbf{X},\alpha) = p( \mathbf{X}\mid \theta, \alpha)p(\theta, \alpha) = p( \mathbf{X}\mid \theta, \alpha)p(\theta\mid \alpha)p(\alpha)$$

Equating these, we have: $$p(\theta \mid \mathbf{X},\alpha) = \frac{p( \mathbf{X}\mid \theta, \alpha)p(\theta\mid \alpha)p(\alpha)}{p(\mathbf{X} \mid \alpha)p(\alpha)} =\frac{p( \mathbf{X}\mid \theta, \alpha)p(\theta\mid \alpha)}{p(\mathbf{X} \mid \alpha)} $$

which is not quite the same as the expression given, because we have a $p( \mathbf{X}\mid \theta, \alpha) $ term rather than a $p(\mathbf{X}\mid \theta) $ term. Where am I going wrong?

Best answer

Bayes' theorem, with every term additionally conditioned on $C$, states that: $$\begin{align} P(A\mid B, C) & = \dfrac{P(B\mid A, C)\;P(A\mid C)}{P(B\mid C)} \\[2ex] \therefore p(\theta \mid \mathbf X,\alpha) & = \dfrac{p(\mathbf X\mid \theta, \alpha)\; p(\theta\mid \alpha)}{p(\mathbf X\mid \alpha)} \end{align}$$

Now $\mathbf X$ is a vector of data points $x_i$, each drawn from a distribution determined by the parameter $\theta$, which in turn has a distribution determined by the (hyper)parameter $\alpha$. This means $p(\mathbf X\mid \theta, \alpha) = p(\mathbf X\mid \theta)$, i.e. $\mathbf X$ is conditionally independent of $\alpha$ given $\theta$: once you know the parameter, the hyperparameter adds no further information about the probability distribution of the data points.
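One way to make this step explicit is to write the hierarchical structure as a factorization of the joint density. The factorization in the first line below is an assumption about the model (it is the usual reading of "$\alpha$ determines $\theta$, $\theta$ determines $\mathbf X$"), not something that follows from Bayes' theorem alone:

$$\begin{align} p(\mathbf X, \theta, \alpha) & = p(\mathbf X\mid \theta)\; p(\theta\mid \alpha)\; p(\alpha) \\[2ex] \Rightarrow\quad p(\mathbf X\mid \theta, \alpha) & = \dfrac{p(\mathbf X, \theta, \alpha)}{p(\theta, \alpha)} = \dfrac{p(\mathbf X\mid \theta)\; p(\theta\mid \alpha)\; p(\alpha)}{p(\theta\mid \alpha)\; p(\alpha)} = p(\mathbf X\mid \theta) \end{align}$$

This holds wherever $p(\theta, \alpha) > 0$. Without that modelling assumption, the general form with $p(\mathbf X\mid \theta, \alpha)$ is the one that remains valid.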

Thus: $$\begin{align} p(\theta \mid \mathbf X,\alpha) & = \dfrac{p(\mathbf X\mid \theta)\; p(\theta\mid \alpha)}{p(\mathbf X\mid \alpha)} \end{align}$$
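As a sanity check, the identity can be verified numerically on a small discrete model. The sketch below is only an illustration: the binomial likelihood, the three candidate values of $\theta$, the two values of $\alpha$, and all the probabilities are made-up assumptions, and the names (`joint`, `marginal`, `check`) are hypothetical helpers. It builds the joint distribution from the chain $\alpha \to \theta \to x$, then confirms both that $p(x\mid\theta,\alpha) = p(x\mid\theta)$ and that the posterior computed directly from the joint matches the formula above.

```python
from math import comb, isclose

# Toy hierarchical model (all numbers are illustrative assumptions):
# alpha -> theta -> x, where x is the number of heads in N coin flips.
N = 3
ALPHAS = [0, 1]
THETAS = [0.2, 0.5, 0.8]
XS = range(N + 1)

p_alpha = {0: 0.4, 1: 0.6}
p_theta_given_alpha = {
    0: {0.2: 0.7, 0.5: 0.2, 0.8: 0.1},
    1: {0.2: 0.1, 0.5: 0.3, 0.8: 0.6},
}


def p_x_given_theta(x, theta):
    """Binomial likelihood; note that it does not depend on alpha."""
    return comb(N, x) * theta**x * (1 - theta) ** (N - x)


# Joint p(theta, x, alpha) built from the assumed factorization.
joint = {
    (t, x, a): p_x_given_theta(x, t) * p_theta_given_alpha[a][t] * p_alpha[a]
    for t in THETAS
    for x in XS
    for a in ALPHAS
}


def marginal(theta=None, x=None, alpha=None):
    """Sum the joint over every variable left as None."""
    return sum(
        p
        for (t, xx, a), p in joint.items()
        if (theta is None or t == theta)
        and (x is None or xx == x)
        and (alpha is None or a == alpha)
    )


def check():
    for a in ALPHAS:
        for x in XS:
            for t in THETAS:
                # p(x | theta, alpha) recovered from the joint alone.
                lik_full = joint[(t, x, a)] / marginal(theta=t, alpha=a)
                assert isclose(lik_full, p_x_given_theta(x, t))

                # Posterior p(theta | x, alpha) straight from the joint ...
                post_direct = joint[(t, x, a)] / marginal(x=x, alpha=a)
                # ... and via p(x | theta) p(theta | alpha) / p(x | alpha).
                evidence = marginal(x=x, alpha=a) / p_alpha[a]
                post_bayes = (
                    p_x_given_theta(x, t) * p_theta_given_alpha[a][t] / evidence
                )
                assert isclose(post_direct, post_bayes)
    return True


if __name__ == "__main__":
    print("identity holds on the toy model:", check())
```

The check passes only because the joint was built so that $x$ depends on $\alpha$ through $\theta$ alone; if the likelihood were made to depend on $\alpha$ as well, the first assertion would fail and the $p(\mathbf X\mid\theta,\alpha)$ form would be needed.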