Is Kullback-Leibler divergence a general measure of the difference between two probability distributions? Relation to the Kolmogorov-Smirnov test?


I am reading 'Deep Learning' by Goodfellow et al., and its review of information theory states that the Kullback-Leibler divergence can measure how different two distributions are, without any further qualification.

However, it seems to me that the KL divergence measures the difference between two distributions from the point of view of their information content, and not in general.

If I understand correctly, two distributions that are each symmetric around their respective means should have the same information content -- the amount of 'certainty' and 'surprise' in messages about events drawn from these two distributions should be the same, given their symmetry. At the same time, the two distributions could be very different, one concentrating a high mass of probability on low values and the other on high values.
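For concreteness, here is a small sketch of the situation I have in mind (the particular distributions are made up for illustration): two mirror-image discrete distributions that have identical entropy, yet a strictly positive KL divergence between them:

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) in nats."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p))

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

# Two distributions on {0, 1, 2, 3}: p puts most mass on low values,
# q is its mirror image and puts most mass on high values.
p = np.array([0.4, 0.4, 0.1, 0.1])
q = p[::-1]  # [0.1, 0.1, 0.4, 0.4]

print(entropy(p), entropy(q))  # equal: mirroring the support does not change entropy
print(kl(p, q))                # strictly positive: KL still distinguishes them
```

So the two distributions carry the same 'information value' in the entropy sense, but KL divergence is not an entropy comparison: it compares the probabilities event by event, which is why it comes out nonzero here.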

Is my understanding correct?

And finally, how does the Kullback-Leibler divergence compare to the Kolmogorov-Smirnov test?
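To make the comparison concrete, here is how I would compute the two quantities side by side for the discrete distributions above (the KS statistic is the largest gap between the two CDFs; for samples rather than known distributions, `scipy.stats.ks_2samp` computes the two-sample version):

```python
import numpy as np

p = np.array([0.4, 0.4, 0.1, 0.1])
q = np.array([0.1, 0.1, 0.4, 0.4])

# KL divergence D(p || q): expected log-likelihood ratio under p,
# computed pointwise on the probability mass functions.
kl_pq = np.sum(p * np.log(p / q))

# Kolmogorov-Smirnov statistic: maximum absolute gap between the CDFs.
ks = np.max(np.abs(np.cumsum(p) - np.cumsum(q)))

print(kl_pq, ks)  # KS here is 0.6, the gap after the second outcome
```

As I understand it, the KS statistic only looks at the worst-case CDF gap, while KL divergence weights every discrepancy by the probability mass under p -- but I would appreciate confirmation of whether that framing is right.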

Your advice will be appreciated.