Why do we refer to the denominator of Bayes' theorem as "marginal probability"?


Consider the following characterization of Bayes' theorem:

Bayes' Theorem

Given some observed data $x$, the posterior probability that the parameter $\Theta$ has the value $\theta$ is $p(\theta \mid x) = p(x \mid \theta) p(\theta) / p(x)$, where $p(x \mid \theta)$ is the likelihood, $p(\theta)$ is the prior probability of the value $\theta$, and $p(x)$ is the marginal probability of the value $x$.

Is there any special reason why we call $p(x)$ the "marginal probability"? What is "marginal" about it?


The explanation I was given when I was taught conditional probabilities is that if you draw up a table of the probabilities $p(x,y)$, then the row/column sums $$ p(x) = \sum_{y} p(x,y) $$ (by the law of total probability) are written in the margins of the table.


If you lay out a joint distribution as a table, with the values of the two variables indexing the rows and columns and their probabilities entered in the cells, then the "marginal distribution" of each variable is found by summing along the rows (or columns) and writing the totals in the margins of the table.

$$\begin{array}{c c} & X \\ \Theta & \boxed{\begin{array}{c|cc|c} ~ & 0 & 1 & p(\theta) \\ \hline 0 & 0.15 & 0.35 & 0.5 \\ 1 & 0.20 & 0.30 & 0.5 \\\hline p(x) & 0.35 & 0.65 & ~\end{array}}\end{array}$$
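As a sanity check, here is a minimal NumPy sketch (variable names are illustrative) that reproduces the margins of this table and shows that the Bayes denominator $p(x)$ is exactly the column-sum marginal:

```python
import numpy as np

# Joint distribution p(theta, x) from the table above:
# rows index theta in {0, 1}, columns index x in {0, 1}.
joint = np.array([[0.15, 0.35],
                  [0.20, 0.30]])

# Marginals: sum out the other variable (law of total probability).
p_theta = joint.sum(axis=1)  # row sums    -> [0.5, 0.5]
p_x = joint.sum(axis=0)      # column sums -> [0.35, 0.65]

# Bayes' theorem: p(theta | x) = p(theta, x) / p(x);
# dividing each column by its margin normalizes it into a posterior.
posterior = joint / p_x

print(p_theta)           # [0.5 0.5]
print(p_x)               # [0.35 0.65]
print(posterior.sum(axis=0))  # each column sums to 1
```

Each column of `posterior` is the posterior over $\Theta$ given that value of $x$; e.g. $p(\theta \mid x{=}1)$ is $[0.35, 0.30]/0.65$.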