Need help with P(D) in a Bayesian model

I've been reading about Bayesian models, so I thought I'd set up a toy example I could play with.

Consider the following: You are at a bus stop and you observe the bus arriving at various times $t_1, t_2, t_3, t_4, \dots, t_{10}$. You know that the buses are scheduled to arrive at a fixed interval of $p$ minutes, with $1 < p < 2$, and you know the first bus is scheduled to arrive precisely 2 minutes after you start observing. The buses, however, don't arrive precisely on time. For a given scheduled arrival time, the actual arrival is a draw from a normal distribution centred on the scheduled time with variance 1. You know nothing more about $p$, so you can assume it is uniformly distributed on $[1,2]$.

Now, we can use a simple Bayesian model. I'll walk you through my steps, just to make sure I did not misunderstand something fundamental (and to clear my own thoughts). We use:

$$P(H|D) = \frac{P(D|H)P(H)}{P(D)}$$

$P(H)$ (the probability of the hypothesis) is easy, it's the uniform distribution of $p$ on $[1,2]$.
$P(D|H)$ (the probability of the data, given the hypothesis) is a bit more complicated, but it's the probability of the independent observations $t_1, \dots, t_{10}$ happening, given a fixed $p$. Since you know the distribution of each $t_i$ and the observations are independent, the joint density is the product of the individual normal densities.
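
To make this concrete, here is how the likelihood could be written out. This assumes the $i$-th observed arrival belongs to the $i$-th scheduled bus, so that its scheduled time is $2 + (i-1)p$; the setup above suggests this but does not state it explicitly.

$$P(D|H) = \prod_{i=1}^{10} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\left(t_i - (2 + (i-1)p)\right)^2}{2}\right)$$

Strictly speaking this is a probability density in the $t_i$ rather than a probability, since the arrival times are continuous.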

Now, what I am confused about is $P(D)$ (the probability of the data). This is the probability that the observations happen, but from which space are we drawing this? The guides I have seen online say that this is the probability of $D$ without any prior. But isn't this the whole point of Bayesian models, that there is always a prior, although sometimes a hidden one?

Provided my logic here is correct, what should $P(D)$ be here and why?

Best answer

Since you know $P(H)$ and $P(D|H)$, you can compute $P(D)$ by the Law of Total Probability:

\begin{align} P(D) = \int_{H^{*} \in \Theta}{P(D|H^{*})dP(H^{*})} \end{align}
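
For the bus example in the question, the hypothesis is the value of $p$ and the prior is uniform on $[1,2]$, so its density equals 1 on that interval. Under that reading, the integral reduces to:

$$P(D) = \int_{1}^{2} P(D|p) \cdot 1 \, dp$$

So $P(D)$ is not "without any prior": it is the likelihood averaged over the prior.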

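You usually don't need to compute this value in Bayesian computations, though. It does not depend on $H$, so it only normalises the posterior, and $P(H|D) \propto P(D|H)P(H)$ is often enough, for example to find the most probable $H$ or to sample from the posterior with MCMC.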
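
If you do want a number for the bus example, a rough numerical sketch is below. It is only an illustration under stated assumptions: the data are simulated (with a made-up true interval of 1.5 minutes), the $i$-th observation is assumed to belong to the $i$-th scheduled bus, and the integral over $p \in [1,2]$ is approximated with a plain trapezoid rule. The function names are hypothetical, not from any library.

```python
import math
import random


def log_likelihood(p, times, first=2.0, sigma=1.0):
    """log P(D | p), assuming observation i (0-based) is scheduled at first + i*p
    and the actual arrival is normal around that time with std. dev. sigma."""
    total = 0.0
    for i, t in enumerate(times):
        mean = first + i * p
        total += -0.5 * math.log(2 * math.pi * sigma**2)
        total -= (t - mean) ** 2 / (2 * sigma**2)
    return total


def evidence_and_posterior(times, lo=1.0, hi=2.0, n=2001):
    """Approximate P(D) and the posterior density of p on a grid over [lo, hi]."""
    step = (hi - lo) / (n - 1)
    grid = [lo + step * k for k in range(n)]
    prior = 1.0 / (hi - lo)  # uniform prior density on [lo, hi]
    joint = [math.exp(log_likelihood(p, times)) * prior for p in grid]
    # Trapezoid rule for the integral of P(D | p) * P(p) over p.
    evidence = step * (sum(joint) - 0.5 * (joint[0] + joint[-1]))
    posterior = [j / evidence for j in joint]
    return grid, evidence, posterior


# Simulated data: the "true" interval 1.5 is an arbitrary choice for this demo.
rng = random.Random(0)
times = [rng.gauss(2.0 + i * 1.5, 1.0) for i in range(10)]

grid, evidence, posterior = evidence_and_posterior(times)

# Sanity check: the normalised posterior should integrate to 1 over [1, 2].
step = grid[1] - grid[0]
area = step * (sum(posterior) - 0.5 * (posterior[0] + posterior[-1]))
print("P(D) is approximately", evidence)
print("posterior integrates to", area)
```

Dividing by `evidence` only rescales the curve. The shape of the posterior, and the location of its peak, are already fixed by the likelihood times the prior, which is why the value of $P(D)$ can often be skipped.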