Relation between the probability that Bayesian model selection picks the correct model and the number of observations?


Suppose I have a set of candidate models $\{M_{1},M_{2},...\}$ and observed data $D$. The data $D$ is generated by a model $M_{*}$, and we assume $M_{*}$ is in the candidate set. I use Bayesian model selection to pick one model from the candidate set, and I would like it to pick the correct model $M_{*}$.

The posterior probability $P(M_{i}|D)$ of a model $M_{i}$ given data $D$ is: $$P(M_{i}|D) \propto P(M_{i})P(D|M_{i})$$ where $P(D|M_{i})=\int P(D|\theta,M_{i})P(\theta|M_{i}) \, d\theta$ is the marginal likelihood (or model evidence) of $M_{i}$. The model with the largest posterior probability is then selected.
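To make the selection rule concrete, here is a minimal sketch of turning log marginal likelihoods into posterior model probabilities. The function name `posterior_model_probs` and the input numbers are made up for illustration; the only ingredient taken from the formula above is $P(M_{i}|D) \propto P(M_{i})P(D|M_{i})$, normalized over the candidate set.

```python
import math

def posterior_model_probs(log_evidences, log_priors):
    """Normalize P(M_i) * P(D | M_i) over the candidate set, working in logs.

    log_evidences[i] = ln P(D | M_i), log_priors[i] = ln P(M_i).
    Both lists are hypothetical inputs; computing the evidence itself
    (the integral over theta) is not handled here.
    """
    log_joint = [le + lp for le, lp in zip(log_evidences, log_priors)]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_joint)
    weights = [math.exp(v - m) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    # Two candidate models with equal prior probability (illustrative values).
    probs = posterior_model_probs([-10.0, -12.0], [math.log(0.5)] * 2)
    selected = max(range(len(probs)), key=lambda i: probs[i])
    print(probs, "selected index:", selected)
```

With equal priors, the selected model is simply the one with the largest evidence.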

Is there a theoretical guarantee that the selected model is the correct one with some probability? And how does that probability depend on the number of observations?

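According to page 20 of these slides, the Bayes factor favors the correct model on average, because the expectation under $M_{*}$, $$E\left[\ln{\frac{P(D|M_{*})}{P(D|M_{i})}}\right]=\int P(D|M_{*}) \ln{\frac{P(D|M_{*})}{P(D|M_{i})}} \, dD,$$ is the KL divergence between $P(D|M_{*})$ and $P(D|M_{i})$, so it is always greater than or equal to zero. But this is a statement about the expected log Bayes factor; it does not tell me the probability that the correct model is selected.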
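To show what I mean by "probability of selecting the correct model", here is a small Monte Carlo sketch on a toy problem of my own choosing (it is not from the slides). The two candidate models for $n$ coin flips are assumptions made purely for illustration: a "fair" model with $\theta = 0.5$ fixed, and a "uniform" model with $\theta \sim \mathrm{Uniform}(0,1)$. Both have closed-form evidence, and the model priors are taken to be equal, so selection reduces to comparing evidences.

```python
import math
import random

def log_evidence_fair(k, n):
    # Assumed model "fair": theta = 0.5, so P(D | M) = 0.5**n for any sequence.
    return n * math.log(0.5)

def log_evidence_uniform(k, n):
    # Assumed model "uniform": theta ~ Uniform(0, 1).
    # The integral of theta^k (1 - theta)^(n - k) is the Beta function B(k+1, n-k+1).
    return math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)

def prob_correct_selection(n, true_model, trials=2000, seed=0):
    """Estimate how often the true model has the larger evidence at sample size n."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # Generate data from the true model (theta drawn from its prior).
        theta = 0.5 if true_model == "fair" else rng.random()
        k = sum(rng.random() < theta for _ in range(n))
        # Equal model priors: pick the model with the larger evidence.
        if log_evidence_fair(k, n) > log_evidence_uniform(k, n):
            chosen = "fair"
        else:
            chosen = "uniform"
        correct += chosen == true_model
    return correct / trials

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              prob_correct_selection(n, "fair"),
              prob_correct_selection(n, "uniform"))
```

This only gives an empirical frequency for one specific pair of models; what I am looking for is a general statement about how this frequency behaves as $n$ grows.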

Any useful reference or possible directions to look at will be appreciated. Thank you.