I've seen it stated in these online notes (Link to pdf) that the relative entropy, $\mathrm D(p\| q)\equiv\sum_x p_x \log\frac{p_x}{q_x}$, can be understood as quantifying how easy it is to discriminate between two possible probability distributions using samples drawn from them. That is, loosely quoting from the first paragraph of the notes: suppose we are given $X_1,...,X_n$, all sampled IID either from $p$ or from $q$. The optimal test for deciding, based on the samples, which distribution they came from has a failure rate of $$e^{-n(\mathrm D(p\|q)+o(1))}.$$ The notes mention a "Stein's lemma", but by googling I haven't been able to find a basic source discussing these results.
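To make the claim concrete, here is a small numerical check I put together (the alphabet, the distributions, and the threshold test are my own illustrative choices, not taken from the notes). It computes $\mathrm D(p\|q)$ for two distributions on three symbols, and then computes *exactly* (by enumerating multinomial count vectors) the probability under $q$ that a log-likelihood-ratio test mistakenly accepts $p$. The decay exponent $-\frac1n\log\beta_n$ should approach something close to $\mathrm D(p\|q)$ when the threshold is close to $\mathrm D(p\|q)$:

```python
import numpy as np
from math import lgamma, log, exp

# Two illustrative distributions on a 3-letter alphabet (my choice)
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])

D = float(np.sum(p * np.log(p / q)))  # D(p||q) in nats
llr = np.log(p / q)                   # per-sample log-likelihood ratio

def log_multinomial(n, counts):
    """log of the multinomial coefficient n! / (k0! k1! k2!)."""
    return lgamma(n + 1) - sum(lgamma(k + 1) for k in counts)

def type2_error(n, c):
    """Exact P_q( (1/n) * sum_i llr(X_i) >= c ): the probability, under q,
    that the threshold test mistakenly accepts p. Since the statistic only
    depends on the empirical counts, we can enumerate all count vectors."""
    total = 0.0
    for k0 in range(n + 1):
        for k1 in range(n + 1 - k0):
            counts = (k0, k1, n - k0 - k1)
            if np.dot(counts, llr) >= n * c:
                logprob = log_multinomial(n, counts) + sum(
                    k * log(qi) for k, qi in zip(counts, q))
                total += exp(logprob)
    return total

# Under p, the average LLR concentrates around D by the LLN, so a threshold
# slightly below D still accepts p with high probability for large n.
c = 0.9 * D
for n in (20, 80, 320):
    beta = type2_error(n, c)
    print(n, -log(beta) / n)  # decay exponent; close to D for large n
```

Running this, the exponent $-\frac1n\log\beta_n$ does seem to settle near $\mathrm D(p\|q)\approx 0.275$ nats, consistent with the quoted statement, but I'd like to see the actual derivation.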
What is a primary source (possibly a textbook or online notes) discussing this result and how it is derived? Or, even better, what is a direct way to arrive at this result?