1D Bayesian Inference clarification


I'd like some help making sure I understand a 1D Bayesian inference problem. Stats.stackexchange wasn't helpful.

Say I have a data vector which is an array of the number of flu cases reported weekly in California for the past 10 years.

I want to compare two models which describe this data, via the odds ratio: $$\mathcal{O}_{ij} = \frac{P(M_i|D,I)}{P(M_j|D,I)} = \frac{P(M_i|I)\,\mathcal{L}(M_i)}{P(M_j|I)\,\mathcal{L}(M_j)}$$ where $$\mathcal{L}(M) = \int d\theta\, P(\theta|M)\,\mathcal{L}(\theta|D,M).$$

Here $M$ is a model and $D$ is the data; I'm not sure what $I$ represents.

I've developed two models, each of which is a 1D vector; one of them has parameters.

I'm new to Bayesian statistics. My confusion is: what are $\mathcal{L}(\theta|D,M)$, $P(M|I)$, and $P(D|I)$ in relation to the dataset and the two models? I think $P(D|I)$ is just the data. $P(M|I)$ is the prior, but I don't know what it relates to, unless that's the model, in which case I don't know what the likelihood function $\mathcal{L}$ is.

Could someone help clarify this for me?

1 Answer

I'll assume that $M_i$ and $M_j$ are two different models, $D$ is the observed data, $I$ is prior information, and $\theta$ is a parameter.

To formulate each model, we have available prior information that tells us how probable that model is before observing the data: $$ p(M_i|I),\qquad p(M_j|I).$$

Each model represents a way of specifying the prior distribution of the unknown parameter: $$p(\theta|M_i), \qquad p(\theta|M_j).$$

In addition, given the parameter, we can specify the distribution of the data (sometimes referred to as the likelihood of the parameter): $$\mathcal L (\theta) = p(D|\theta).$$
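To make the likelihood concrete for the weekly-count setting in the question, here is a minimal sketch. The Poisson model, the rate parameter `theta`, and the data values are all illustrative assumptions, not part of the original post:

```python
import numpy as np
from scipy.stats import poisson

# Hypothetical weekly flu-case counts (the data vector D).
D = np.array([12, 30, 45, 80, 60, 25, 10])

def log_likelihood(theta, D):
    """Log-likelihood of the rate parameter theta, assuming the
    weekly counts are independent Poisson draws: log p(D | theta)."""
    return poisson.logpmf(D, mu=theta).sum()
```

Evaluating `log_likelihood` on a grid of `theta` values traces out $\mathcal L(\theta)$ as a function of the parameter, with the data held fixed; that is the sense in which it is "the likelihood of the parameter".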

The objective is to compare the two models via the odds ratio, to gauge which of them is more probable given the observed data $D$. To this end, we need to compute $$\mathcal O_{ij} = \frac{p(M_i|D,I)}{p(M_j|D,I)}.$$ If $\mathcal O_{ij} > 1$, the data and the prior information support $M_i$ more than $M_j$ (and conversely if $\mathcal O_{ij} < 1$).

The problem is now that we do not have the posterior distribution of the models given the data. In theory, this could be computed using Bayes' theorem, $$p(M_i | D,I) = \frac{p(M_i, D|I)}{p(D|I)} = \frac{p(D|M_i)\, p(M_i|I)}{p(D|I)},$$ but the normalization constant $p(D|I)$ is usually very hard to evaluate. However, this factor is common to both $p(M_i|D,I)$ and $p(M_j|D,I)$, so when we compute the odds ratio it cancels and we can disregard it: $$\mathcal O_{ij} = \frac{p(D|M_i)\,p(M_i|I)}{p(D|M_j)\,p(M_j|I)}.$$

Now we just need to compute $p(D|M_i)$; this is sometimes called the model evidence, and we can find it by marginalizing over the parameter (the sum rule of probability theory):

$$ p(D|M_i) = \int p(D,\theta |M_i) \,\mathrm d \theta = \int p(D|\theta, M_i)\,p(\theta|M_i) \,\mathrm d \theta = \int \mathcal L(\theta)\, p(\theta|M_i) \,\mathrm d \theta. $$
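For a 1D parameter this integral can be approximated directly on a grid. The sketch below assumes a Poisson likelihood for the weekly counts and two models that differ only in their Gamma priors on the rate; the data values and prior hyperparameters are hypothetical, chosen only to illustrate the computation:

```python
import numpy as np
from scipy.stats import poisson, gamma

# Hypothetical weekly flu-case counts (the data vector D).
D = np.array([12, 30, 45, 80, 60, 25, 10])

def evidence(D, prior_a, prior_scale):
    """Approximate p(D|M) = integral of L(theta) p(theta|M) dtheta on a
    uniform grid, assuming a Poisson likelihood and a
    Gamma(prior_a, scale=prior_scale) prior on the rate theta."""
    theta = np.linspace(1e-3, 200.0, 5000)
    dtheta = theta[1] - theta[0]
    # L(theta) for every grid point: sum log-pmf over the data, then exponentiate.
    log_like = poisson.logpmf(D[:, None], mu=theta[None, :]).sum(axis=0)
    prior = gamma.pdf(theta, a=prior_a, scale=prior_scale)
    return np.sum(np.exp(log_like) * prior) * dtheta

# Odds ratio between two models that differ only in their priors,
# assuming equal prior model probabilities p(M_i|I) = p(M_j|I):
O_ij = evidence(D, 2.0, 20.0) / evidence(D, 50.0, 0.5)
```

With equal prior model probabilities, $\mathcal O_{ij}$ reduces to the ratio of evidences (the Bayes factor), which is what the final line computes.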