The misuse of Dirac's $\delta$


In physics and engineering it is common practice to use the Dirac delta distribution to represent "densities" of discrete random variables. It is a very useful construct and you can do many things with it easily.

$$f_{\pmb x}(x)=\sum_{n=1}^{\infty}P(\pmb x = x_n) \,\delta(x-x_n) \;=\; \sum_{n=1}^{\infty}p_n \,\delta(x-x_n)$$

$$E_{f_{\pmb x}}\{\delta(g(\pmb x))\} \;=\; \sum_i \frac{f_{\pmb x}(x_i)}{{|g'(x_i)|}}\enspace, \quad \text{ with } g(x_i)=0 \text{ and } g'(x_i)\neq 0$$

$$\pmb y = g(\pmb x) \quad\Rightarrow \quad f_{\pmb y}(y) \;=\; E_{f_{\pmb y}}\{\delta(\pmb y - y)\} \;=\; E_{f _{\pmb x}}\{\delta(g(\pmb x)-y)\}$$

But the mathematicians always say it would be mathematically objectionable or even incorrect, because a density function made up of delta distributions is not continuous and not integrable. However the definition of the delta distribution precisely defines its integral. So what is the problem here?

Is there an example where the use of the Dirac delta function can lead to wrong results?


8
On BEST ANSWER

The Dirac distribution really is a function – specifically, a functional $$ \tilde\delta : (\mathbb{R} \to \mathbb{R}) \to \mathbb{R}, \qquad \tilde\delta(f) := f(0). $$ That definition is perfectly simple and uncontroversial.

The funny thing is, nobody's actually using it this way! For a reason I find strange, physicists and also many mathematicians actually seem more suspicious about such a simple, but “higher-order” function than about a function on the real axis itself, even if it requires “infinite function values” to work.

What's actually going on with the standard definition is this: the functions $\mathbb{R} \to \mathbb{R}$ form a vector space. If you narrow it down to only functions whose square is integrable over the entire domain, you get the $L^2(\mathbb{R})$ Hilbert space.

One of the nice things in Hilbert spaces is the Riesz representation theorem. It says roughly that a Hilbert space is isomorphic to its dual space; in this case meaning, the space of linear functionals $L^2(\mathbb{R}) \to \mathbb{R}$ is isomorphic to $L^2(\mathbb{R})$ itself. In other words, any square-integrable function has a canonical corresponding functional, and vice versa. These corresponding pairs are always basically given by imitating the integral over the product. For instance, $g(x) = e^{-x^2/2}$ has the corresponding functional $$ \tilde g(f) = \int_\mathbb{R}\!\!\mathrm{d}x \: g(x)\cdot f(x). $$ That choice is canonical because you can reconstruct $g$ from that functional, as the unique unit-norm function which maximises the $L^2$ scalar product. (That this is possible in a Hilbert space – thanks to the completeness property – is the interesting bit about the Riesz representation theorem.)

Naïvely, we could conclude from this that $\tilde\delta$ has a corresponding function $\mathbb{R}\to\mathbb{R}$. It is after all a functional on functions, and then we can as well consider it only on square-integrable ones... what's the problem?

Well, the problem is that $L^2(\mathbb{R})$ is not really just an integrability-restriction of the space of functions. It's actually a space of equivalence classes of such functions: when two functions only differ on a Lebesgue null set, they're considered the same element of $L^2(\mathbb{R})$. And that means $\tilde\delta$ isn't actually defined on $L^2(\mathbb{R})$, because if you change the function only at the point 0 you'd get a different result, but from the “same” argument. And that would be your wrong results from naïve use of $\delta$ as a “real-valued function”: if you evaluate it with functions that are tweaked at a single spot, you can get wrong results.
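A small numerical sketch of that point, under stated assumptions: the helper names `delta_functional`, `f_tweaked` and `l2_distance_sq` are made up for illustration, and the midpoint grid is chosen so that it never samples $x = 0$, which mimics (but does not prove) the fact that a single point has Lebesgue measure zero.

```python
import math

def delta_functional(f):
    """The functional described above: it maps a function f to f(0)."""
    return f(0.0)

def f(x):
    return math.exp(-x * x / 2)

def f_tweaked(x):
    # Identical to f except at the single point x = 0 (a Lebesgue null set).
    return 5.0 if x == 0.0 else f(x)

def l2_distance_sq(u, v, a=-10.0, b=10.0, cells=2000):
    """Midpoint-rule estimate of the integral of (u - v)^2 over [a, b]."""
    h = (b - a) / cells
    total = 0.0
    for k in range(cells):
        x = a + (k + 0.5) * h  # midpoints of an even grid never land on 0
        total += (u(x) - v(x)) ** 2
    return total * h

print(delta_functional(f))           # value of f at 0
print(delta_functional(f_tweaked))   # value of the tweaked function at 0
print(l2_distance_sq(f, f_tweaked))  # estimated squared L^2 distance
```

The two functions are indistinguishable to the integral, so they represent the same element of $L^2(\mathbb{R})$, yet the functional returns two different numbers for them.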

The reason this isn't usually an issue in physics is the “all functions are continuous” paradigm. Because while every element of $L^2(\mathbb{R})$ contains many functions, each differing only in a null set (e.g. only in discrete points), there is always at most a single continuous such function. So, $\tilde\delta$ is actually well defined as a functional $L^2(\mathbb{R}) \cap \mathcal{C}^0(\mathbb{R}) \to \mathbb{R}$. Then again, that is not a Hilbert space, but it's certainly an actual subset of one, so the physicists are doing ok.


To be precise (as Hans reminds me to be), the dual space in question is only the space of bounded linear functionals (or equivalently, continuous linear functionals, though I'd remark that continuity on functionals should not be confused with continuity on corresponding functions). So even if $\tilde\delta$ were a well-defined functional – which in fact you can make it by restricting yourself further to the $H^1$ Sobolev space, in which each equivalence class has exactly one continuous member – you wouldn't be able to apply Riesz, because the functional would not be bounded, i.e. you would be able to construct a sequence of $L^2$ functions that all have the same $L^2$ norm but give infinitely growing results of $\tilde\delta$.
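One concrete sequence of this kind, given here as an illustrative choice rather than necessarily the one intended above, is a family of ever narrower Gaussians:

$$ f_n(x) = \sqrt{n}\, e^{-n^2 x^2/2}, \qquad \|f_n\|_{L^2}^2 = \int_\mathbb{R}\!\!\mathrm{d}x \: n\, e^{-n^2 x^2} = \sqrt{\pi}, \qquad \tilde\delta(f_n) = f_n(0) = \sqrt{n} \xrightarrow{\;n\to\infty\;} \infty. $$

Every $f_n$ has the same norm $\pi^{1/4}$, yet $\tilde\delta(f_n)$ grows without bound, so no constant $C$ can satisfy $|\tilde\delta(f)| \le C\,\|f\|_{L^2}$ for all such $f$.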

0
On

What the mathematicians are saying is not that the Dirac delta function fails to be continuous or integrable, since that would first require the object under discussion to be an $\mathbf R\to\mathbf R$ function. Their point is that it is not an $\mathbf R\to\mathbf R$ function at all. The Dirac delta is nevertheless rigorously defined, only not as an $\mathbf R\to\mathbf R$ function but as a linear functional, i.e. a linear map from a function space into the real (or complex) numbers. Such functionals are called distributions or generalized functions.

1
On

I am very happy to consider $\delta$ as a function, without having to mess with functional analysis that drives intuition away. Until I stumble upon the square $(\delta(x))^2$, which is a suspicious object: indeed, a single Internet search reveals trouble. $^{[1]}$

There is also another issue: if $\phi\colon \mathbb R^n\to \mathbb R$, I would be very happy to consider $\delta(\phi(x_1\ldots x_n))$. That's another case in which the $\delta$ "function" does not behave like a function at all.


Note [1]. At a more down-to-earth level, Note 2 in this answer shows that $\delta$ cannot be considered as an element of $L^2(\mathbb R)$, not even in a "weak" sense, so $\delta(x)^2$ should have an infinite integral, to begin with.
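The divergence can be made plausible numerically. The sketch below assumes a Gaussian of width `eps` as a stand-in for $\delta$ (one common choice of approximating family, not the only one); `nascent_delta` and `integrate` are hypothetical helper names. For this family the integral of the square has the closed form $1/(2\sqrt{\pi}\,\varepsilon)$, which is printed alongside for comparison.

```python
import math

def nascent_delta(x, eps):
    """Gaussian of width eps with unit integral, used as a stand-in for delta."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def integrate(func, a=-1.0, b=1.0, cells=200_000):
    """Midpoint-rule estimate of the integral of func over [a, b]."""
    h = (b - a) / cells
    return sum(func(a + (k + 0.5) * h) for k in range(cells)) * h

for eps in (0.1, 0.01, 0.001):
    mass = integrate(lambda x: nascent_delta(x, eps))          # stays close to 1
    square = integrate(lambda x: nascent_delta(x, eps) ** 2)   # grows like 1/eps
    closed_form = 1 / (2 * math.sqrt(math.pi) * eps)
    print(eps, mass, square, closed_form)
```

The integral of the approximation itself stays near 1 as `eps` shrinks, while the integral of its square keeps growing, which is consistent with $\delta(x)^2$ having no finite integral.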

1
On

It might help to analogize somewhat to CS (assuming you're familiar enough with CS to understand this, of course). The "normal" use for functions is to provide an evaluation value: f(x) = something. You call f, you pass the parameter x, and you get some value returned. However, more generally/abstractly, you can think of a function as an object with an evaluation method: f.evaluate(x) = something. Depending on the context in which functions are being used, there can then be further methods: f.taylor(n), for instance, might return the nth coefficient in the Taylor expansion. If you're doing combinatorics, this might be all you're interested in: if f is a generating function, then you might not care at all what it actually evaluates to. And once you stop caring about that, then there are going to be objects that do have what you care about, but don't have what a "true" function has. For instance, if f.taylor(n) = $n^n$, then its radius of convergence is zero, and it is therefore undefined everywhere but zero. More precisely, its evaluation is undefined everywhere but zero. Its Taylor coefficients are still perfectly well-defined.

So we can have a class that has been generalized from the traditional concept of a “function”, but now is no longer required to have something integral to that concept. Now, consider how functions are used in probability. Generally, a probability function isn’t used for its evaluation method; the PDF gives the probability “density” at a point, but the actual probability “mass” at any point is zero. It’s only over a non-zero-measure set that a PDF has non-zero probability mass. So what we really need is a CDF method: f.cdf(a,b) gives the probability mass accumulated over the interval (a,b). So again we have a class that looks a lot like “functions”, and is often given in function form, e.g. $f(x) = e^{-x}$. As long as we’re in this abstract function-like class, there’s no problem including the delta “function”. But when we treat it as actually being a function, that can cause problems.
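To make the analogy concrete, here is a minimal sketch of such a class hierarchy. The class names `Exponential` and `PointMass` are hypothetical, and the method names follow the `f.evaluate(x)` and `f.cdf(a,b)` notation used above, with `cdf(a, b)` assumed to return the probability mass of the open interval (a, b).

```python
import math

class Exponential:
    """Ordinary density f(x) = e^(-x) on x >= 0: both methods make sense."""

    def evaluate(self, x):
        return math.exp(-x) if x >= 0 else 0.0

    def cdf(self, a, b):
        # Mass of (a, b): integrate e^(-x) over the part of the interval in [0, inf).
        lo, hi = max(a, 0.0), max(b, 0.0)
        return math.exp(-lo) - math.exp(-hi)

class PointMass:
    """Delta-like object: all probability sits at a single location."""

    def __init__(self, location):
        self.location = location

    def cdf(self, a, b):
        # Interval mass is perfectly well defined: 1 if the atom is inside, else 0.
        return 1.0 if a < self.location < b else 0.0

    def evaluate(self, x):
        # Pointwise evaluation is where the "function" picture breaks down.
        if x == self.location:
            raise ValueError("no real-valued density at the atom")
        return 0.0

delta = PointMass(0.0)
print(delta.cdf(-1.0, 1.0))          # interval contains the atom
print(delta.cdf(0.5, 2.0))           # interval misses the atom
print(Exponential().cdf(0.0, 1.0))   # 1 - e^(-1)
```

Code that only ever calls `cdf` treats both objects alike; code that calls `evaluate` at the atom is where treating the delta as a true function fails.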

So what are some cases where it can cause problems? Well, obviously trying to evaluate it at 0 causes a problem. Since it’s not continuous, interchanging limits that involve the delta function can cause problems. Applying the Fundamental Theorem of Calculus causes problems (the FTC basically says that the derivative of the integral is equal to the original function). Taking the Fourier Transform does not itself cause problems, but assuming that the result will be in $L^2$ does. More generally, one has to be more careful when doing normalization when dealing with delta functions.