How far is a distribution from the uniform distribution?


I have two discrete probability distributions $P$ and $Q$, where $P=(p_1,...,p_n)$ and $Q=(q_1,...,q_n)$; in addition I have the uniform distribution $U=(\frac{1}{n},...,\frac{1}{n})$.

The question is how to measure which of $P$ or $Q$ is closer to the uniform distribution.

I am not sure whether I can use the Kullback–Leibler divergence, because it is not a "true" distance. I also don't know whether entropy can be used.


BEST ANSWER
  • Total variation distance, also known as statistical distance, is a good (very stringent) metric. Note that up to a factor of $2$, it equals the $\ell_1$ distance between the vectors of probabilities. It also has a nice interpretation: it is the maximum difference in probability the two distributions assign to any event.

  • $\ell_2$ distance will be much more forgiving of small differences and will put the emphasis on outliers.

  • Hellinger distance also has some nice properties and a natural interpretation, although it is perhaps less commonly used.

  • Kolmogorov distance (equivalently, the $\ell_\infty$ distance between CDFs) will make sense if your domain $\{1,\dots,n\}$ has a meaningful order on it.

All of these (and more, e.g. Wasserstein/earthmover distance) are valid choices; ultimately, it will depend on your application.
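As a concrete illustration, here is a minimal NumPy sketch of the four metrics above, applied to a hypothetical distribution $P$ and the uniform $U$ (the function names and example values are my own, not from the answer):

```python
import numpy as np

def tv_distance(p, q):
    # Total variation = half the l1 distance between probability vectors
    return 0.5 * np.sum(np.abs(p - q))

def l2_distance(p, q):
    # Euclidean (l2) distance between probability vectors
    return np.sqrt(np.sum((p - q) ** 2))

def hellinger_distance(p, q):
    # Hellinger distance: l2 distance between sqrt-vectors, scaled to [0, 1]
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2)

def kolmogorov_distance(p, q):
    # l_inf distance between CDFs; assumes the domain {1,...,n} is ordered
    return np.max(np.abs(np.cumsum(p) - np.cumsum(q)))

n = 4
u = np.full(n, 1.0 / n)          # uniform distribution on n points
p = np.array([0.4, 0.3, 0.2, 0.1])

print(tv_distance(p, u))         # ~0.2
print(kolmogorov_distance(p, u)) # ~0.2 (they need not agree in general)
```

Note how the different metrics weight the same deviations differently: TV sums all the discrepancies, while Kolmogorov only looks at the worst cumulative gap.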

A good resource is "Distances and affinities between measures", Chapter 3 of *Asymptopia* by Pollard. "On choosing and bounding probability metrics" by Gibbs and Su is also a recommended read.


What prevents you from using the Kullback–Leibler divergence (KL divergence) as a measure of distance from the uniform distribution? I agree that the KL divergence is not a true "distance" because it satisfies neither (a) symmetry nor (b) the triangle inequality.

Nonetheless, it can serve as a criterion for measuring how far/close a distribution is to the uniform distribution. Suppose $\mathcal{X}=\{x_{1},\ldots,x_{n}\}$ is a finite alphabet, and $P=\{p_{1},\ldots,p_{n}\}$ and $U=\{1/n,\ldots,1/n\}$ are two distributions on $\mathcal{X}$, with $U$ being the uniform distribution. Then, the KL-divergence between $P$ and $U$, denoted as $D(P||U)$, is defined to be the following quantity:

\begin{align} D(P||U)&=\sum\limits_{i=1}^{n}P(x_{i})\log_{2}\left(\frac{P(x_{i})}{U(x_{i})}\right)\\ &=\sum\limits_{i=1}^{n}p_{i}\log_{2}\left(\frac{p_{i}}{1/n}\right)\\ &=\log_{2}\left(n\right)+\sum\limits_{i=1}^{n}p_{i}\log_{2}\left({p_{i}}\right)\\ &=\log_{2}(n)-H(P), \end{align} where $H(P)=\sum\limits_{i=1}^{n}p_{i}\log_{2}\left(\frac{1}{p_{i}}\right)$ is the (Shannon) entropy of the distribution $P$. Since $D(\cdot||\cdot)\geq 0$, we have $H(P)\leq\log_{2}(n)$, so the uniform distribution, whose entropy equals $\log_{2}(n)$ bits, is the most "random" distribution that can be assigned to the alphabet.
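The identity $D(P||U)=\log_{2}(n)-H(P)$ can be checked numerically. Below is a small sketch (function names are my own) that computes both sides for a hypothetical distribution; the convention $0\log 0=0$ is handled by skipping zero entries:

```python
import numpy as np

def kl_from_uniform(p):
    # D(P||U) in bits, with U uniform on n = len(p) points:
    # sum_i p_i * log2(p_i / (1/n)) = sum_i p_i * log2(n * p_i)
    p = np.asarray(p, dtype=float)
    nz = p > 0  # 0 * log(0) is treated as 0
    return np.sum(p[nz] * np.log2(len(p) * p[nz]))

def entropy(p):
    # Shannon entropy H(P) in bits
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

p = np.array([0.5, 0.25, 0.125, 0.125])
n = len(p)

# Both sides of the identity D(P||U) = log2(n) - H(P)
print(kl_from_uniform(p))        # 0.25
print(np.log2(n) - entropy(p))   # 0.25
```

For this $P$, $H(P)=1.75$ bits and $\log_2(4)=2$ bits, so $D(P||U)=0.25$ bits, matching the direct computation.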

If there is another distribution $Q=\{q_{1},\ldots,q_{n}\}$ defined on $\mathcal{X}$, and if $D(P||U)<D(Q||U)$, then $H(P)>H(Q)$, and thus, $P$ is more "random" than $Q$ (which makes sense since $P$ is closer to the uniform distribution than $Q$).

Thus, the closer a distribution is to the uniform distribution (closer in the sense of KL divergence), the more "random" it is.
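The comparison $D(P||U)<D(Q||U)$ can be sketched directly; the example distributions below are my own, chosen so that $P$ is visibly flatter than $Q$:

```python
import numpy as np

def kl_from_uniform(p):
    # D(P||U) in bits for U uniform on len(p) points
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return np.sum(p[nz] * np.log2(len(p) * p[nz]))

# P is close to flat; Q is heavily concentrated on one symbol
p = np.array([0.3, 0.3, 0.2, 0.2])
q = np.array([0.7, 0.1, 0.1, 0.1])

# Smaller divergence from U means closer to uniform, i.e. "more random"
print(kl_from_uniform(p) < kl_from_uniform(q))  # True
```

Here $D(P||U)\approx 0.03$ bits while $D(Q||U)\approx 0.64$ bits, so the criterion correctly ranks $P$ as the closer (and higher-entropy) distribution.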