Where is the wild use of the Dirac delta function in physics justified?

Question

2.9k Views Asked by Bumbble Comm At 29 Mar 2026 - 12:41

Wikipedia has a wild article about the Dirac delta function. Are the things listed there correct, or is there no proof that they are correct? For my master's thesis I want to refer to rigorous proofs of these properties, if they exist. The problem is that Wikipedia's list of references is meager, and in almost every appropriate place the references are missing. To give you a taste, some properties Wikipedia lists are:

Fourier transform of the delta function
composition of the delta function with another function
translations of the delta function
the delta function is an even function
the property $\delta(ax) = \delta(x)/|a|$
algebraic properties
integration by parts of integrals containing the delta function
distributional derivatives

I looked at a few texts (e.g. Griffel, Modern Functional Analysis), but they were not relevant for two reasons. First, the space of functions was too small (test functions with compact support). In physics, the function being convolved (not the generalized function) is usually any function on $\mathbb{R}^d$, and therefore I am interested in as large a space as possible. Second, they didn't cover anywhere near all of these properties.

Is there a math book written by a mathematician (not a physicist) which treats much of the above rigorously? Alternatively, if you can justify that the above properties are just physics (not math) sufficiently well, then I can let it go and get on with my life. Either is appreciated.

There are 5 best solutions below

**Bumbble Comm** · Answer 1 · 2016-11-26 20:24:06

Associated to a function $f$ on a domain $D$ there is a linear operator given by $$g \mapsto \int_D f(x) \, g(x) \, dx$$ If we have a point $0 \in D$ then there is also a linear operator given by $$g \mapsto g(0)$$ and in many ways this behaves very much like a linear operator of the previous kind. For one thing, if you take a sequence of compact domains $C_i \to \{0\}$ and consider the "average value of $g$ on $C_i$" linear operator $$g \mapsto \int_D \frac{1_{C_i}(x)}{{\rm vol}(C_i)} g(x) \, dx$$ associated to the normalized indicator function $$f(x) = \frac{1_{C_i}(x)}{{\rm vol}(C_i)}$$ then this should obviously converge to the operator $g \mapsto g(0)$, at least assuming that things are set up so that convergence works properly. So we can imagine the linear operator $g \mapsto g(0)$ being associated to a "generalized function" $\delta(x)$, so that

$$``\int_D \delta(x) g(x) \, dx\text{''} := g(0)$$

You then just proceed to define "generalized functions" (or "distributions") to be objects having the desired properties, while in the background you're really just replacing the notion of a function $f$ with the associated linear operator [1] $$g \mapsto \int_D f(x) \, g(x) \, dx$$

That's really everything you need to know. Everything else just comes down to picking exactly what context you want to work in and choosing the things that make sense there -- if you want to use a larger space of test functions, you just have to restrict the class of functions $f$ you allow yourself to consider. But this just has to do with the functions (or "functions") that $\delta$ is going to sit alongside; $\delta$ itself works under pretty much any circumstances, since it doesn't require any notion of convergence to define.
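As a rough numerical illustration of the averaging operators described above (a sketch only, taking $D = \mathbb{R}$ and $C_i = [-\varepsilon, \varepsilon]$; the function `g` and the helper `average_on_interval` are made up for this example), the averages of a continuous function over shrinking intervals around $0$ should approach $g(0)$:

```python
import math

def average_on_interval(g, eps, n=2_000):
    """Midpoint-rule approximation of (1 / (2*eps)) * integral of g over [-eps, eps]."""
    h = 2 * eps / n
    total = sum(g(-eps + (k + 0.5) * h) for k in range(n))
    return total * h / (2 * eps)

def g(x):
    # An arbitrary smooth function chosen purely for illustration.
    return math.cos(x) + x**3 + 2.0

# Shrink the interval C_i = [-eps, eps] towards the point {0}.
for eps in (1.0, 0.1, 0.01, 0.001):
    print(f"eps = {eps:<6} average of g = {average_on_interval(g, eps):.8f}")
print(f"g(0) = {g(0.0):.8f}")
```

The printed averages should settle towards the value of `g(0.0)` as `eps` shrinks, which is the sense in which the normalized indicator functions "converge to $\delta$".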

UPDATE: Knowing the above, the proofs of most of the statements listed in the question are routine calculations. You can find the definitions of all these things in any functional analysis text and simply plug in the Dirac delta. For instance, by definition the Fourier transform of a function is

$$\hat{f}(s) = \int_{-\infty}^\infty f(x) e^{-2 \pi i x s} \, dx$$

If we regard a function $f$ as corresponding to the linear operator $F$, where

$$F(g) := \int_{-\infty}^\infty f(x) g(x) \, dx$$

then this leads us to define

$$\hat{f}(s) := F(e^{-2 \pi i x s})$$

where "f" can be anything we associate a linear operator $F$ to. Remembering that $\delta$ is just a formal symbol corresponding to the linear operator $L(g) := g(0)$, we have

$$\hat{\delta}(s) = L(e^{-2 \pi i x s}) = e^{-2 \pi i \cdot 0 \cdot s} = 1$$
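A quick numerical sanity check of $\hat{\delta} = 1$, offered only as a sketch: it replaces $\delta$ by normalized Gaussians of shrinking width $\sigma$ (an assumption of this example, not part of the argument above) and uses the convention $\hat{f}(s) = \int f(x) e^{-2 \pi i x s} \, dx$. The helper names `gaussian` and `fourier_transform` are hypothetical.

```python
import cmath
import math

def gaussian(x, sigma):
    """Normalized Gaussian of width sigma; integrates to 1 for every sigma > 0."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fourier_transform(f, s, half_width, n=4_000):
    """Midpoint-rule approximation of the integral of f(x) * exp(-2*pi*i*x*s)
    over the interval [-half_width, half_width]."""
    h = 2 * half_width / n
    total = 0j
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * x * s)
    return total * h

s = 1.5  # an arbitrary frequency, chosen for illustration
for sigma in (1.0, 0.1, 0.01):
    value = fourier_transform(lambda x: gaussian(x, sigma), s, half_width=10 * sigma)
    print(f"sigma = {sigma:<5} transform at s = {s}: {value.real:.6f}")
```

For a Gaussian the transform is known in closed form, $e^{-2 \pi^2 \sigma^2 s^2}$, so the printed values should move towards $1$ as $\sigma \to 0$ at any fixed $s$.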

Similarly, if $f$ is a differentiable function then we can consider the linear operator associated to $f'$,

$$g \mapsto \int_{-\infty}^{\infty} f'(x) \, g(x) \, dx = - \int_{-\infty}^{\infty} f(x) g'(x) \, dx$$

where the equality follows from integrating by parts, using the fact that we're necessarily working in some context where $\lim_{x \to \pm \infty} f(x) g(x) = 0$. So the linear operator associated to $f'$ is

$$g \mapsto - \int_{-\infty}^{\infty} f(x) g'(x) \, dx = - F(g')$$

so we choose to take this as the definition of the derivative of something we can associate a linear operator to. In the case of the Dirac delta function, $\delta'$ denotes the object associated to the linear operator $g \mapsto -L(g') = -g'(0)$.
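The same recipe seems to cover the scaling property $\delta(ax) = \delta(x)/|a|$ from the question's list. What follows is a sketch under the assumption $a \neq 0$, with $F$ and $L$ as above; the notation $g(\cdot / a)$ for the function $x \mapsto g(x/a)$ is introduced here only for convenience. For an ordinary function $f$, the substitution $y = ax$ gives

$$\int_{-\infty}^{\infty} f(ax) \, g(x) \, dx = \frac{1}{|a|} \int_{-\infty}^{\infty} f(y) \, g(y/a) \, dy = \frac{1}{|a|} F\big(g(\cdot / a)\big)$$

so it is natural to define "$f(ax)$" for a generalized $f$ as the linear operator $g \mapsto \frac{1}{|a|} F\big(g(\cdot / a)\big)$. Plugging in $L(g) = g(0)$ yields

$$g \mapsto \frac{1}{|a|} L\big(g(\cdot / a)\big) = \frac{1}{|a|} \, g(0/a) = \frac{1}{|a|} \, g(0) = \frac{1}{|a|} L(g)$$

which is the statement $\delta(ax) = \delta(x)/|a|$. Taking $a = -1$ gives $\delta(-x) = \delta(x)$, i.e. the claim that $\delta$ is even.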

[1] If you prefer measure theory to functional analysis, you might instead think of replacing the function $f(x)$ with the measure $\mu(x) = f(x) \, dx$. Then the $\delta$ "function" is merely a formal notation such that $\delta(x) \, dx$ denotes a point mass measure centered at zero. It amounts to the same thing, since ultimately what you do with a measure is integrate something with respect to it.

**Bumbble Comm** · Answer 2 · 2016-12-02 16:53:23

Distribution Theory and Transform Analysis by A. H. Zemanian is a good book, which should satisfy your needs. Its intended audience is graduate students in engineering and science, hence it's more readable for non-mathematicians than most math books.

**Bumbble Comm** · Answer 3 · 2016-12-06 01:25:11

The term you're looking for is distribution theory. In the language of distributions, it is extremely simple to make the Dirac delta "function" rigorous, and to prove the aforementioned properties.

Here's the basic notion of a distribution:

A distribution is a continuous linear map from a set of nice functions (called "test functions") to $\mathbb{R}$.

Notice, by the way, that this means distributions are actually honest-to-satan functions. However, they're functions that eat other functions, which makes them somewhat different from, say, functions on the real line. For one thing, it's probably not immediately clear how to define a "derivative" or anything else. Once we look at the details, we'll find a way around this pretty quickly.

When we pick different sets of test functions, we get different notions of "distribution." To begin with, let's choose our space of test functions $D$ to be the set of infinitely differentiable functions $\mathbb{R}^d \to \mathbb{R}$ that have compact support (that is, we require the functions to be zero except on some compact set). We need some topology on $D$ in order to make sense of the term "continuous." (If you're not familiar with topologies and convergence, skip the next line for now.) The topology on $D$ is usually given by specifying what convergence means on $D$: we will say that a sequence of elements $\varphi_k$ in $D$ converges to $\varphi$ as $k \to \infty$ if and only if every derivative of $\varphi_k$ converges uniformly to the corresponding derivative of $\varphi$ and all the $\varphi_k$ have supports contained in a common compact set.

An example of a distribution is the map $D \to \mathbb{R}$ given by $$\varphi \mapsto \varphi(0)$$ You can check that this is a continuous linear map. This map is the Dirac delta "function" $\delta$.
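For readers who want to see that check written out, here is one way it might go (a sketch using only the notion of convergence in $D$ described above; $a, b$ are real numbers and $\psi$ is a second test function, both introduced for illustration). Linearity is immediate:

$$\delta(a \varphi + b \psi) = (a \varphi + b \psi)(0) = a \, \varphi(0) + b \, \psi(0) = a \, \delta(\varphi) + b \, \delta(\psi)$$

For continuity, suppose $\varphi_k \to \varphi$ in $D$. Then in particular $\varphi_k \to \varphi$ uniformly, so

$$|\delta(\varphi_k) - \delta(\varphi)| = |\varphi_k(0) - \varphi(0)| \le \sup_{x \in \mathbb{R}^d} |\varphi_k(x) - \varphi(x)| \to 0 \quad \text{as } k \to \infty$$

Note that only uniform convergence of the functions themselves is used here, not of their derivatives and not the common compact support.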

Another example: Say we have a locally integrable function $f: \mathbb{R}^d \to \mathbb{R}$. Then we can define another distribution $$\varphi \mapsto \int f(x) \varphi(x) dx$$ Now this is linear in $\varphi$, and is continuous. I'll write $(f, \varphi)$ for this distribution.

Perhaps somewhat confusingly, when $F$ is a distribution (not a locally integrable function like $f$) we often use the conflicting notation $(F, \varphi)$ to indicate some distribution applied to $\varphi$.

Keep in mind that the first example above cannot be written in the form of the second example, i.e. as integration against a locally integrable function. Nonetheless, the integral notation is often used for it, particularly in physics: $\int \delta(x) f(x) dx = f(0)$. This is a pretty common sleight of hand: we pretend that distributions are given by integrating against a nice function even though not all distributions can be written this way.

Now we want to define a notion of "derivative" for distributions. Since a distribution is a function from a space of functions to the real numbers, it's not immediately clear how to do this. Let's try that aforementioned sleight of hand: consider the distributions of the form $\varphi \mapsto \int f(x) \varphi(x) dx$ for some locally integrable $f$.

From the usual integration by parts formula from ordinary calculus, applied once for each of the $|\alpha|$ derivatives in the multi-index $\alpha$ (and assuming for the moment that $f$ is smooth enough to be differentiated),

$$\int \partial_x^{\alpha} f(x) \varphi(x) dx = (-1)^{|\alpha|} \int f(x) \partial_x^{\alpha} \varphi(x) dx$$

(Note that the usual boundary terms in the integration-by-parts formula go away because $\varphi$ has compact support. For a single first-order derivative the sign is simply $-1$.)

To put this back into the notation from above: $(\partial_x^{\alpha} f, \varphi) = (-1)^{|\alpha|} (f, \partial_x^{\alpha} \varphi)$. So this suggests a way to define "differentiation": let's use this identity as a definition.

That is, for any distribution $F\colon D \to \mathbb{R}$, we define the distributional derivative $\partial_x^{\alpha} F$ by $$(\partial_x^{\alpha} F, \varphi) := (-1)^{|\alpha|} (F, \partial_x^{\alpha} \varphi)$$

For example, let's consider the distribution given by (integrating against) the Heaviside function (we're taking $\mathbb{R}^d$ in the definition of $D$ to be $\mathbb{R}^1$):

$$ H(x) = \begin{cases} 1 &(x >0)\\ 0 &(x\leq 0) \end{cases} $$

As in the second example, the distribution defined by $H$ is $(H, \varphi) = \int H(x) \varphi(x) dx$. As an exercise, compute the derivative of this distribution from the definition (the answer is at the bottom).

So to recap: A distribution is a continuous linear map from a set of nice functions (called "test functions") to $\mathbb{R}$. A dirty trick we will use again and again in distribution theory is to systematically confuse a function $f$ and the distribution given by integrating against it. Using this trick, we can use relatively basic mathematics to understand what certain notions like integration ought to mean for distributions, and then take this to be the definition. I've shown how to do this with (partial) derivatives; you can do the same with convolutions, adjoints, and more.

The above is just meant to give you a small flavor of the subject, so I won't go any further, and most good analysis texts should have more details if you seek them. A readable source (though not one I personally favor) is Stein and Shakarchi's Functional Analysis, Chapter 3.

Answer: The distributional derivative of this is:

$$(H', \varphi) = - (H, \varphi') = -\int H(x) \varphi'(x) dx = - \int_{0}^\infty \varphi'(x) dx = \varphi(0) - \lim_{R \to \infty} \varphi(R) = \varphi(0)$$

Notice that the Dirac delta "function" (distribution) applied to $\varphi$ gives precisely the same thing! (Hence the common confusing claim in intro physics classes: "the Dirac delta is just the derivative of the Heaviside function.")
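The computation above can also be mimicked numerically. The following is a minimal sketch, not a rigorous treatment: distributions are modeled as Python callables that eat a test function, derivatives of test functions are approximated by central differences, and integrals by the midpoint rule. All names (`bump`, `shifted_bump`, `from_function`, `distributional_derivative`) are made up for this example.

```python
import math

def bump(x):
    """A standard smooth test function supported on [-1, 1]."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def shifted_bump(x):
    # Shifted so that the test function is not symmetric about 0.
    return bump(x - 0.3)

def derivative(phi, h=1e-5):
    """Central-difference approximation of phi'."""
    return lambda x: (phi(x + h) - phi(x - h)) / (2 * h)

def from_function(f, a=-2.0, b=2.0, n=20_000):
    """Distribution phi -> integral of f * phi over [a, b] (midpoint rule).
    Assumes the support of phi lies inside [a, b]."""
    step = (b - a) / n
    def pairing(phi):
        total = 0.0
        for k in range(n):
            x = a + (k + 0.5) * step
            total += f(x) * phi(x)
        return total * step
    return pairing

def distributional_derivative(F):
    """Implements the definition (F', phi) := -(F, phi')."""
    return lambda phi: -F(derivative(phi))

def heaviside(x):
    return 1.0 if x > 0 else 0.0

def delta(phi):
    """The Dirac delta as a map on test functions."""
    return phi(0.0)

H = from_function(heaviside)
H_prime = distributional_derivative(H)
print("(H', phi)    =", H_prime(shifted_bump))
print("(delta, phi) =", delta(shifted_bump))
```

Up to discretization error the two printed numbers should agree. Modeling a distribution as a callable rather than as an array of sampled values mirrors the definition used in this answer: a distribution is determined by what it does to test functions, not by pointwise values.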

**Bumbble Comm** · Answer 4 · 2016-12-11 19:17:36

Disclaimer. Not meant as a full answer that covers all of the concerns. Just a few aspects.

As the OP says, Wikipedia has a wild article about the Dirac delta "function". Interestingly enough, I think that there are a few good things in that wild article. The first good thing is the picture right on top:

It may be a problem analytically, but when seen from a purely geometrical viewpoint, there is no problem at all: the Dirac delta is the union of the $x$-axis and the positive part of the $y$-axis. More precisely, it is the set $$ \{(x,y)\in\mathbb{R}^2|((y=0)\land(x\ne 0))\lor((x=0)\land(y>0))\} $$ apart from the fact that (half-)lines in Euclidean geometry cannot have an area, while the Dirac delta has area $1$.

It's typical that the Dutch Wikipedia article on the Dirac delta has an additional section about approximations with test functions. There are two nice GIF animations in the article showing how it works,

with a Gauss function: $\delta_\sigma(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-(x/\sigma)^2/2}$
with a sinc function: $\delta_\sigma(x) = \frac{\sin(x/\sigma)}{\pi x} = \frac{\sin(x/\sigma)}{\sigma \pi (x/\sigma)}$

I find that the absence of a section about Dirac delta test functions in the English version of the Wikipedia is an omission. Therefore I've collected nine of these in a separate web page:

Dirac Delta Test Functions

The secret is in scaling (with $\sigma$). Let $T(x)$ be one of the test functions. Then quite in general we have, for any approximation of the Dirac delta with such a test function ($\sigma > 0$):

$$ \delta_\sigma(x) = \frac{1}{\sigma}T\left(\frac{x}{\sigma}\right) $$

$$ \int_{-\infty}^{+\infty}\delta_\sigma(x)\, dx = \int_{-\infty}^{+\infty} T\left(\frac{x}{\sigma}\right) d\left(\frac{x}{\sigma}\right) = \int_{-\infty}^{+\infty} T(x)\, dx = 1 $$

According to a sloppy definition, maybe used by some physicists, we have:

$$ \delta(x) = \lim_{\sigma\to 0} \delta_\sigma(x) = \lim_{\sigma\to 0} \left[\frac{1}{\sigma}T\left(\frac{x}{\sigma}\right)\right] = 0 \quad \mbox{for} \quad x \ne 0 $$

Sloppy because, upon inspection, this limit covers only part of the geometrical representation, namely:

$$ \{(x,y)|(y=0)\land(x\ne 0)\} \quad \mbox{but not} \quad \{(x,y)|(x=0)\land(y>0)\} $$

In order to cover the half $y$-axis case, we might need the inverse of the test function:

$$ y = \frac{1}{\sigma}T\left(\frac{x}{\sigma}\right) \quad \Longrightarrow \quad x = \sigma \, T^{-1}(\sigma y) $$

And another limit, expressing that the upper $y$-axis is approximated as closely as we want:

$$ \lim_{\sigma\to 0} \left[\sigma \, T^{-1}(\sigma y)\right] = 0 $$

Example. Take test function number (5.), which is the Cauchy distribution:

$$ T(x) = \frac{1/\pi}{1+x^2} \quad \Longrightarrow \quad \delta_\sigma(x) = \frac{1}{\sigma}T\left(\frac{x}{\sigma}\right) = \frac{1/(\pi\sigma)}{1+(x/\sigma)^2} $$

The inverse function (two branches) is found in a few steps:

$$ y = \frac{1/(\pi\sigma)}{1+(x/\sigma)^2} $$

$$ \frac{1}{\pi\sigma y} = 1+(x/\sigma)^2 $$

$$ x = \pm\sigma\sqrt{\frac{1}{\pi\sigma y}-1} $$

For the sake of completeness, for $y > 0$:

$$ \delta_\sigma^{-1}(y) = \begin{cases} \pm\sigma\sqrt{1/(\pi\sigma y)-1} & \mbox{for } y \le 1/(\pi\sigma) \\ 0 & \mbox{for } y > 1/(\pi\sigma) \end{cases} $$

It is clear that for $x\ne 0$:

$$ \lim_{\sigma\to 0} \delta_\sigma(x) = \lim_{\sigma\to 0} \frac{1/(\pi\sigma)}{1+(x/\sigma)^2} = \lim_{\sigma\to 0} \frac{\sigma/\pi}{\sigma^2+x^2} = 0 $$

On the other hand, for $0 < y \le 1/(\pi\sigma)$:

$$ \lim_{\sigma\to 0} \delta^{-1}_\sigma(y) = \lim_{\sigma\to 0} \pm\sigma\sqrt{\frac{1}{\pi\sigma y}-1} = \lim_{\sigma\to 0} \pm\sqrt{\frac{\sigma}{\pi y}-\sigma^2} = 0 $$

Thus, in the limit, the Dirac delta is indeed equal to the geometry that is represented by the set $\{(x,y)\in\mathbb{R}^2|((y=0)\land(x\ne 0))\lor((x=0)\land(y>0))\}$.
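The scaling recipe $\delta_\sigma(x) = \frac{1}{\sigma}T(x/\sigma)$ can be tried out numerically. The sketch below checks the integral property rather than the geometric picture: it integrates $\delta_\sigma$ against a rapidly decaying function `g` (a hypothetical choice with $g(0) = 1$) for the Gaussian and Cauchy kernels and prints the result for shrinking $\sigma$. The helper names `delta_sigma` and `pair` are made up for this example.

```python
import math

def gaussian_kernel(x):
    """Standard normal density; integrates to 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def cauchy_kernel(x):
    """Cauchy density T(x) = (1/pi) / (1 + x^2); integrates to 1."""
    return (1 / math.pi) / (1 + x * x)

def delta_sigma(T, sigma):
    """Rescaled kernel delta_sigma(x) = T(x / sigma) / sigma."""
    return lambda x: T(x / sigma) / sigma

def g(x):
    # Hypothetical rapidly decaying function with g(0) = 1.
    return (1 + x) * math.exp(-x * x)

def pair(kernel, g, half_width=6.0, n=60_000):
    """Midpoint-rule approximation of the integral of kernel(x) * g(x)
    over [-half_width, half_width]; g is negligible outside that range."""
    step = 2 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * step
        total += kernel(x) * g(x)
    return total * step

for name, T in (("Gaussian", gaussian_kernel), ("Cauchy", cauchy_kernel)):
    for sigma in (1.0, 0.1, 0.01):
        value = pair(delta_sigma(T, sigma), g)
        print(f"{name:<8} sigma = {sigma:<5} integral = {value:.5f}   g(0) = {g(0.0):.5f}")
```

Both families should approach `g(0.0)` as $\sigma \to 0$, with the Cauchy kernel converging more slowly because of its heavy tails.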

**Bumbble Comm** · Answer 5 · 2017-05-06 16:35:39

To answer "Is there a math book written by a mathematician (not a physicist) which treats much of the above rigorously?":

The following references are my favorites:

a) "Mathematics for the Physical Sciences", Laurent Schwartz;

b) "Generalized Functions vol 1", I.M. Gelfand, G. E. Shilov.

These are classics and primary sources in the areas of generalized functions, b), and distributions, a). Generalized functions and distributions are the same thing, see wiki 'generalized functions'. Both are rigorous math books, and are very readable. They both have much information on the Dirac Delta distribution (aka 'Delta Function').

Schwartz is credited with originating the 'theory of distributions' which is also the title of his original book (in French only). "Mathematics for the Physical Sciences" contains much of the material in that book.

Gelfand, a master mathematician, goes into even more detail. A substantial portion of vol 1 is devoted to the Dirac Distribution.

To answer "Alternatively, if you can justify that the above properties are just physics (not math) ...":

The properties are math, not physics.
