Origin of the notation for statistical divergence

2k Views Asked by Bumbble Comm At 05 Apr 2026 - 5:13

The unusual notation $D(P||Q)$ seems to be universally used for statistical divergences (e.g. KL divergence). What is the origin of this notation, and do the double bars (pipe symbols) have any significance in statistics/probability or information theory?


There are 2 best solutions below

Bumbble Comm On 13 Jan 2016 - 11:24 BEST ANSWER

Kullback and Leibler did not originate the $D(P||Q)$ notation. In their paper "On Information and Sufficiency", Ann. Math. Stat., 22(1):79-86, 1951, they use $$I_{1:2}(E)=\frac{1}{\mu_1(E)}\int_{E} \log \frac{f_1(x)}{f_2(x)} \,d\mu_1(x),$$ stated for a subset $E\subseteq S$ of the sample space $S$. They attribute this notation to Halmos and Savage.
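For comparison with the modern symbol, here is a short sketch of how their expression reduces to today's $D(P||Q)$. It assumes, as in their setup, that $\mu_1$ and $\mu_2$ are probability measures with densities $f_1$ and $f_2$ with respect to a common dominating measure, written $\lambda$ here. Taking $E=S$ gives $\mu_1(S)=1$, so $$I_{1:2}(S)=\int_{S} \log \frac{f_1(x)}{f_2(x)} \,d\mu_1(x)=\int_{S} f_1(x)\log \frac{f_1(x)}{f_2(x)} \,d\lambda(x)=D(\mu_1||\mu_2).$$ So the quantity is the same; only the symbol changed later.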

Shannon doesn't seem to use it either, as far as I can tell from a cursory look. Maybe an information theorist (Cover? Wolfowitz? Gallager? Wyner? Csiszár?) adopted the notation later on. In Gallager's classic book, though, the quantity only appears in a problem, for the discrete case, and without a symbol, just as a sum.

The two vertical bars may be there to stop people from thinking it is a conditional distribution, which is what a single bar would suggest.

Bumbble Comm On 19 Jan 2016 - 1:19

Double bars are not a universally accepted notation for statistical distances. Some authors prefer a comma instead, as in $D(P,Q)$. Even within the information theory community, double bars are not required to indicate the distance between probability measures. That said, it is quite likely that this creative notation was introduced by some information theorists.
