What is the meaning of a high/low KL divergence between two distributions?


Resources available on the internet say that KL divergence measures the similarity between two distributions. But if you use the TensorFlow built-in method to compute the KL divergence between two identical Normal distributions, it gives you a large number. Why is that?

I used the code below:

import tensorflow as tf
import tensorflow_probability as tfp

kl = tf.keras.losses.KLDivergence()  # Define KL divergence loss
tfd = tfp.distributions
dist = tfd.Normal(loc=0., scale=1.)  # Build a standard Normal distribution
kl(dist.sample([300000]), dist.sample([300000]))  # Compute the KL between two identical distributions

output:
<tf.Tensor: shape=(), dtype=float32, numpy=772134.5>
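For comparison, the true KL divergence between two identical distributions should be exactly zero. A minimal sanity-check sketch using the closed-form expression for two univariate Normals (plain Python, no TensorFlow; the function name `kl_normal` is my own):

```python
import math

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form KL(N(mu1, sigma1^2) || N(mu2, sigma2^2))."""
    return (math.log(sigma2 / sigma1)
            + (sigma1**2 + (mu1 - mu2)**2) / (2 * sigma2**2)
            - 0.5)

print(kl_normal(0.0, 1.0, 0.0, 1.0))  # identical distributions -> 0.0
print(kl_normal(0.0, 1.0, 1.0, 1.0))  # mean shifted by 1 -> 0.5
```

This shows that the large number above cannot be the KL divergence between the two distributions themselves; it is a value computed from two independent batches of raw samples fed into a loss function.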