Why is a distribution called a distribution?


I learned in class that a distribution $f$ is a continuous linear mapping from the space of smooth test functions $\phi$ to the real numbers.

Why is this mapping called a "distribution"? Can you explain to me the intuition/motivation for the name?

What is $f$ a distribution of?


There are 2 best solutions below

0
On

My personal guess is that distributions were introduced at around the same time, historically, as probability was extended into analysis territory. (I think this was as late as the early 1900s; please correct me if I'm wrong.)

You may be aware of the probability density function (PDF) for continuous variables

$$pdf(t) \text{ such that } \int_{-\infty}^{\infty}pdf(\tau)d\tau = 1$$

Cumulative Distribution Function (CDF):

$$cdf(t) = \int_{-\infty}^{t}pdf(\tau)d\tau$$

To describe discrete distributions with integrals within this same framework, ordinary Riemann integrable functions are not enough.

For a pdf to describe the behaviour of a fair six-sided die, we need "infinitely thin" slices of probability density, each carrying probability $1/6$, concentrated at the integers $\{1,2,3,4,5,6\}$. This cannot be done with an ordinary Riemann integrable function $f$, because no such function can have the property $$\lim_{\epsilon \to 0}\int_{t-\epsilon}^{t+\epsilon}f(\tau)d\tau \neq 0$$

This is a good reason to introduce "something" $\delta(t)$ with the property

$$\int_{-\infty}^{\infty}\delta(t)\phi(t) dt = \phi(0)$$

where $\delta$ is the Dirac delta distribution and $\phi$ is a test function. We need it in order to bring discrete probability distributions into the same framework as continuous probability distributions.
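As a sketch of how this plays out for the die (my own worked example, assuming the shifted delta $\delta(t-k)$ picks out the value of a test function at $t = k$), the pdf can be written as a sum of six point masses:

$$pdf(t) = \frac{1}{6}\sum_{k=1}^{6}\delta(t-k), \qquad \int_{-\infty}^{\infty}pdf(t)\phi(t)\,dt = \frac{1}{6}\sum_{k=1}^{6}\phi(k)$$

The total probability is $6 \cdot \frac{1}{6} = 1$, as required, and the cdf becomes a staircase that jumps by $1/6$ at each of the integers $1, \dots, 6$. Formally taking $\phi(t) = t$ (which is not compactly supported, but the point masses only see it at six points) gives the familiar mean $\frac{1}{6}(1+2+\dots+6) = 3.5$.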

2
On

According to Earliest Known Uses of Some of the Words of Mathematics, the term distribution for a generalised function was introduced by Schwartz in Généralisation de la notion de fonction, de dérivation, de transformation de Fourier et applications mathématiques et physiques. We read at the beginning of §1:

Les éléments sur lesquels il faut raisonner sont plus généraux que des fonctions. Ainsi $\delta(x)$ n’est pas une fonction, c’est une mesure ou distribution de masses, d’un type particulièrement simple : elle comporte une masse $+1$ placée à l’origine. Une distribution de masses $(\mu)$ est entièrement définie par la connaissance de la masse $\mu(a, b)$ contenue dans tout intervalle $(a, b)$ ; c’est un nombre réel de signe quelconque ou même un nombre complexe. $\mu(a, b)$ ne peut pas être une fonction quelconque d’intervalle, elle doit vérifier une condition qui exprime que la somme des modules des masses est finie et une condition d’additivité. $(\mu)$ permet de définir une fonction d’ensemble $A$ par $$ \mu(A) = \int_A d\mu $$ et plus généralement une fonctionnelle $\mu(\varphi)$ définie au moins pour toute fonction continue $\varphi(x)$, nulle en dehors d’un intervalle fini : $$ \mu(\varphi) = \int_{-\infty}^{+\infty} \varphi(x) d\mu \qquad \text{(intégrale de Stieltjes).} $$ [...]

of which a rough translation is

The objects we must reason about are more general than functions. Thus $\delta(x)$ is not a function, but a measure or mass distribution, of a particularly simple type: it consists of a mass $+1$ placed at the origin. A mass distribution $(\mu)$ is completely determined by knowing the mass $\mu(a, b)$ contained in every interval $(a, b)$; this is a real number of either sign, or even a complex number. $\mu(a, b)$ cannot be an arbitrary function of intervals: it must satisfy a condition which says that the sum of the absolute values of the masses is finite, and an additivity condition. $(\mu)$ allows us to define a function of sets $A$ by $$ \mu(A) = \int_A d\mu $$ and more generally a functional $\mu(\varphi)$ defined at least for every continuous function $\varphi(x)$ that is zero outside a finite interval: $$ \mu(\varphi) = \int_{-\infty}^{+\infty} \varphi(x) d\mu \qquad \text{(Stieltjes integral).} $$ [...]
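As a small illustration (mine, not part of the quoted text), applying these definitions to the unit mass at the origin gives

$$ \mu(a, b) = \begin{cases} 1 & \text{if } a < 0 < b, \\ 0 & \text{otherwise,} \end{cases} \qquad \mu(\varphi) = \int_{-\infty}^{+\infty} \varphi(x)\, d\mu = \varphi(0), $$

which is the usual defining property of $\delta$, now read as "all of the mass sits at $x = 0$".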

So it's a distribution (of mass). Later on in the section, we find that

[...] Il est ensuite nécessaire de définir des distributions plus générales que des distributions de masses et qui correspondent aux « couches multiples » (couches de doublets ou dipôles et couches plus compliquées) employées dans la théorie du potentiel. [...]

(In potential theory, "couche" means a layer, as in single-layer and double-layer potentials, so this is approximately)

[...] It is then necessary to define distributions more general than mass distributions, corresponding to the "multiple layers" (layers of doublets or dipoles, and more complicated layers) used in potential theory. [...]

So they are "mass distributions", but allowing for generalisation (for differentiation, etc.).
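To make the dipole case concrete, here is a minimal sketch (my own example, with the symbol $\mu_\varepsilon$ introduced only for illustration): place a mass $+\frac{1}{2\varepsilon}$ at $x = \varepsilon$ and a mass $-\frac{1}{2\varepsilon}$ at $x = -\varepsilon$. For a differentiable test function $\varphi$,

$$ \mu_\varepsilon(\varphi) = \frac{\varphi(\varepsilon) - \varphi(-\varepsilon)}{2\varepsilon} \longrightarrow \varphi'(0) \qquad (\varepsilon \to 0). $$

Each $\mu_\varepsilon$ is an honest mass distribution, but the sum of the absolute values of its masses is $1/\varepsilon$, which is unbounded as $\varepsilon \to 0$. The limiting functional $\varphi \mapsto \varphi'(0)$ (which is $-\delta'$ in the usual notation) is therefore not a mass distribution, yet it is still a perfectly good distribution in the generalised sense.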