I’m wondering if someone could recommend a good graduate-level textbook or resource for learning more about entropy and information theory, divergence criteria, and so on. I have had two graduate courses on probability, one on measure theory, and one on stochastic control. I’ve seen only a little about this in probability books and in large-deviations books, where the focus is on rate functions.
Thanks!
I recommend Patrick Billingsley’s beautiful book “Ergodic Theory and Information”. Rigorous but very reader-friendly. The treatment of entropy there is measure-theoretic, not topological. It was published in 1965, so you should probably consult a university library. By the way, for both treatments of entropy and their interrelationships, you should consult Walters’s “An Introduction to Ergodic Theory” (GTM). Also excellent, but a bit more demanding.
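For context on what “measure-theoretic entropy” means here, this is the standard Kolmogorov–Sinai construction covered in both books (a sketch of the usual definitions, not specific to either text): for a measure-preserving transformation $T$ on a probability space $(X, \mathcal{B}, \mu)$ and a finite measurable partition $P = \{P_1, \dots, P_k\}$,

```latex
% Entropy of a finite partition P:
H(P) = -\sum_{i=1}^{k} \mu(P_i)\,\log \mu(P_i)

% Entropy of T with respect to P (the limit exists by subadditivity):
h(T, P) = \lim_{n \to \infty} \frac{1}{n}\, H\!\left(\bigvee_{i=0}^{n-1} T^{-i} P\right)

% Measure-theoretic (Kolmogorov–Sinai) entropy of T:
h_\mu(T) = \sup_{P} \; h(T, P)
```

Topological entropy is defined analogously via open covers (or spanning/separated sets) with no reference to a measure; the variational principle relating the two, $h_{\mathrm{top}}(T) = \sup_\mu h_\mu(T)$, is one of the “interrelationships” treated in Walters.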