Let us consider the following two probability distributions:

| P    | Q     |
|------|-------|
| 0.01 | 0.002 |
| 0.02 | 0.004 |
| 0.03 | 0.006 |
| 0.04 | 0.008 |
| 0.05 | 0.01  |
| 0.06 | 0.012 |
| 0.07 | 0.014 |
| 0.08 | 0.016 |
| 0.64 | 0.928 |
I have calculated the Kullback-Leibler divergence, which is equal to $0.492820258$. I want to know: in general, what does this number show me? As I understand it, the KL divergence measures how far one probability distribution is from another, and the terminology is similar to that of entropy. But what does the number itself mean? If the result is 0.49, can I say that one distribution is approximately 50% away from the other? Thanks in advance.
The KL divergence is not a dimensionless number: it has a unit (which depends on the base of the logarithm used), and you must specify it unless it's implied by the context. I guess you used base 2, hence the unit is bits. Also, the KL divergence (or "distance") is not symmetric: $D(p || q) \ne D(q || p)$, so, again, you must specify which one you computed.

In our case, $D(p||q)=0.49282$ bits.
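To make both points concrete, here is a small sketch (plain Python, base-2 logarithm) that computes the divergence in both directions; the list names `p` and `q` just hold the two columns from your table:

```python
from math import log2

# The two distributions from the question, row by row
p = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.64]
q = [0.002, 0.004, 0.006, 0.008, 0.01, 0.012, 0.014, 0.016, 0.928]

def kl(a, b):
    """KL divergence D(a || b) in bits (base-2 logarithm)."""
    return sum(ai * log2(ai / bi) for ai, bi in zip(a, b))

print(kl(p, q))  # ~0.49282 bits
print(kl(q, p))  # ~0.33028 bits -- a different number: the divergence is not symmetric
```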
Regarding the numerical significance, you could first compute the entropies. In our case $H(p)=1.9486...$ and $H(q)=0.5745...$ (always in bits). In terms of source encoding (Shannon's first theorem), this says that source $p$ can be optimally encoded with $1.9486...$ bits per symbol. Now, if we encode source $p$ assuming (wrongly) that its true distribution were that of $q$, we'd get an average code length of $2.44142...$ bits per symbol (you can do the math). The "excess" cost, the inefficiency that arises from assuming the wrong distribution, is exactly what the KL divergence quantifies: $2.44142-1.9486=0.49282$ bits.
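"Doing the math" here amounts to computing the cross-entropy $-\sum_i p_i \log_2 q_i$ and checking that it equals $H(p) + D(p||q)$; a quick sketch (same `p` and `q` lists as in your table):

```python
from math import log2

p = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.64]
q = [0.002, 0.004, 0.006, 0.008, 0.01, 0.012, 0.014, 0.016, 0.928]

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(x * log2(x) for x in dist)

def cross_entropy(a, b):
    """Average code length (bits/symbol) when source a is
    encoded with a code that is optimal for distribution b."""
    return -sum(ai * log2(bi) for ai, bi in zip(a, b))

h_p = entropy(p)            # ~1.9486 bits: optimal code length for p
h_q = entropy(q)            # ~0.5745 bits
h_pq = cross_entropy(p, q)  # ~2.4414 bits: cost of using q's code on p
print(h_pq - h_p)           # ~0.49282 bits: the KL divergence D(p||q)
```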