I have a decent understanding of the general idea of relative entropy (Kullback–Leibler divergence). However, I would like help clarifying the following from a set of notes:
" Relative entropy measures good a candidate model $M_k$ is as an approximation for the typically unknown true data generating model $M_{\text{true}}$".
My question is: how can we measure how well a model approximates the true model if the true model is unknown?
Edit: Is it that we collect samples from the true distribution, form an estimate of the true distribution (the empirical distribution), and then, when computing expectations with respect to the true distribution, actually compute averages over the empirical distribution?
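To make the idea in the edit concrete, here is a small sketch (my own illustration, not from the notes). Since $D_{KL}(M_{\text{true}} \| M_k) = E_{M_{\text{true}}}[\log p_{\text{true}}(X)] - E_{M_{\text{true}}}[\log p_k(X)]$ and the first term does not depend on the candidate, ranking candidates by their average log-likelihood on samples from the true distribution ranks them by KL divergence, even though $M_{\text{true}}$ itself is never evaluated. The "true" distribution and the candidate models below are hypothetical choices for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# In practice M_true is unknown; here we only use it to draw samples,
# mimicking data collected from the real data-generating process.
samples = rng.normal(loc=1.0, scale=2.0, size=10_000)

# Hypothetical candidate models M_k.
candidates = {
    "N(0, 1)": stats.norm(0.0, 1.0),
    "N(1, 2)": stats.norm(1.0, 2.0),
    "Laplace(1, 2)": stats.laplace(1.0, 2.0),
}

# Approximate E_{M_true}[log p_k(X)] by the sample average
# (the expectation under the empirical distribution).
scores = {name: np.mean(m.logpdf(samples)) for name, m in candidates.items()}

for name, score in scores.items():
    print(f"{name}: average log-likelihood = {score:.4f}")

# The candidate with the highest average log-likelihood has the
# smallest KL divergence from M_true, up to the unknown constant term.
best = max(scores, key=scores.get)
```

So the unknown constant $E_{M_{\text{true}}}[\log p_{\text{true}}(X)]$ drops out of any *comparison* between candidates; this is also the reasoning behind likelihood-based model selection criteria such as AIC.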