Clarification on Dirac delta distribution


I am studying the Dirac delta and I am struggling with the intuition of the following two points.

  1. I understand the Dirac delta as a distribution with all its mass at one point, perhaps as the limiting case of a zero-mean normal distribution as the standard deviation approaches zero. As such, it integrates to one: $\int_{-\infty}^{\infty} \delta(x) dx = 1$. This seems a little counter-intuitive to me, since the integral is doing something akin to computing the area under a single point.

  2. Intuitively, the expected value of a random variable with the Dirac delta distribution is zero, because the whole mass is centred at $x=0$. That being said, $\delta(x)=\infty$ at $x=0$. How come this does not affect the "weighting" in the expected value calculation?


There are 3 best solutions below


Dirac's delta is not an ordinary function; it is a distribution. The "definition" you are talking about in the second part of your post, i.e. $$\delta(x):=\begin{cases} 0\quad\text{if}\quad x\neq0 \\ +\infty\quad\text{if}\quad x=0\end{cases}\tag{1}$$ is not rigorous. It is a sloppy way of saying that the weight is non-zero only at the point where the $\delta$ function is centered ($x=0$ in the present case). Another way to formulate this is$^1$ $$\begin{cases} \delta(x)=0\quad\text{if}\quad x\neq0\\ \int\limits_{-\infty}^{+\infty}\delta(x)dx=1\end{cases}\tag{2}$$ This will remind you of the description you proposed, as the limit of normalized Gaussian distributions. You can easily see that $(2)$ is not possible for any ordinary function: such a function would be zero for $x\neq0$, as requested, so it could only be non-zero on the set $\{0\}$, which has measure zero because it consists of a single point, and the integral of any function over a measure-zero set is $0$, not $1$. This is why, in "definition" $(1)$, the delta is said to be infinite at the point where it is centered.

Nonetheless, $(2)$ is also ill-defined: since no ordinary function satisfies such a property, the integral of such an object is not defined either.$^2$

So we have understood that this is not a regular real function of a real variable. What, then, is this $\delta$ distribution? Let $V$ be the linear space of all real functions defined on the real line (with the usual operations). The (linear) functional $$\delta:V\rightarrow\mathbb{R}$$ defined by $$\delta(f)=f(0)\qquad\forall f\in V$$ is what we call Dirac's $\delta$.

As I defined $V$, $f$ can be any function $f:\mathbb{R}\rightarrow\mathbb{R}$, e.g. $$f(x)=\cosh(x)\quad(\implies f(0)=1)$$

Then $$\delta(\cosh)=1$$

In integral notation, this would be written as

$$\int_{-\infty}^{+\infty}\delta(x)\cosh(x)dx=1$$
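To make the functional point of view concrete, here is a minimal Python sketch. The name `dirac_delta` and its optional `center` argument are illustrative choices for this post, not functions from any library; the code simply evaluates a callable at a point, which is all that the definition $\delta(f)=f(0)$ asks for.

```python
import math

def dirac_delta(f, center=0.0):
    """Apply the Dirac delta functional centred at `center` to f.

    The input is a whole function and the output is the number
    f(center); no integration is carried out.
    """
    return f(center)

print(dirac_delta(math.cosh))      # cosh(0) = 1.0
print(dirac_delta(lambda x: x))    # 0.0, the value written as the integral of x*delta(x)
print(dirac_delta(math.exp, 2.0))  # exp(2), i.e. the shifted delta(x - 2)
```

Seen this way, the value "$+\infty$ at $x=0$" never appears: the functional only reads off the value of $f$ at the centre.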

I should also mention that there are other definitions; this is the one used in distribution theory. The space on which $\delta$ is defined generally carries additional regularity restrictions that I didn't mention here (for instance, it is often taken to be a space of smooth test functions).


$^1$This is the definition originally used by Paul Dirac.

$^2$ In fact, using the integral symbol for distributions is a matter of convenience; it camouflages what is actually going on, namely the action of the functional described above.

  1. The Dirac delta on $\mathbb{R}$ is a probability measure on the Borel subsets of $\mathbb{R}$ defined by $\delta(A) = I(0 \in A)$, where $I$ denotes the indicator. It is true that $\delta$ is the weak limit of normal distributions $N(0, \varepsilon)$ with mean $0$ and variance $\varepsilon$ as $\varepsilon \to 0$, so it is natural to say that $\delta = N(0, 0)$. What we mean by "$\delta$ integrates to 1" is that $\delta$ is a probability measure that assigns probability $1$ to the point $0$. It is also called "the point mass at $0$".

  2. If $X$ has the Dirac delta distribution, then $P(X = 0) = 1$, so $E(X) = 0$. We can also see this by computing $$E(X) = \int_{\mathbb{R}} x \,\delta(dx) = x|_{x = 0} = 0.$$ The $\delta$ distribution does not have a density (with respect to Lebesgue measure), because if it did, the density would be $0$ almost everywhere and would therefore integrate to $0$ rather than $1$, which is absurd.
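The weak convergence $N(0,\varepsilon)\to\delta$ from point 1 can be checked numerically on intervals. The sketch below is only an illustration: the helper names are made up for this post, and it relies on the closed-form normal CDF expressed through `math.erf` from the Python standard library.

```python
import math

def normal_cdf(x, variance):
    """CDF of N(0, variance) evaluated at x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * variance)))

def normal_prob(a, b, variance):
    """Probability that N(0, variance) lands in the interval (a, b]."""
    return normal_cdf(b, variance) - normal_cdf(a, variance)

def point_mass(a, b):
    """delta((a, b]) = I(0 in (a, b])."""
    return 1.0 if a < 0.0 <= b else 0.0

for variance in (1.0, 1e-2, 1e-4, 1e-6):
    around_zero = normal_prob(-0.1, 0.1, variance)    # interval containing 0
    away_from_zero = normal_prob(0.5, 1.5, variance)  # interval missing 0
    print(variance, around_zero, away_from_zero)

print(point_mass(-0.1, 0.1), point_mass(0.5, 1.5))  # 1.0 0.0
```

As the variance shrinks, the first probability approaches $1$ and the second approaches $0$, matching $\delta(A)$. An interval with $0$ as an endpoint behaves differently: `normal_prob(0.0, 1.0, variance)` tends to $1/2$ while `point_mass(0.0, 1.0)` is $0$. This does not contradict weak convergence, which only constrains sets whose boundary has $\delta$-measure zero.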


Preliminary for understanding anything about the Dirac delta

An absolutely integrable function $\eta:\,\Bbb R\to\Bbb R$ with $\int_{\Bbb R}\eta(x)dx=1$ is a nascent delta function. You've favoured the example $\eta(x):=\frac{1}{\sqrt{2\pi}}\exp\frac{-x^2}{2}$. Regardless of the choice, for $\epsilon>0$ define $\eta_\epsilon(x):=\frac{1}{\epsilon}\eta(\frac{x}{\epsilon})$. For your Gaussian example (and for any $\eta$ with $\eta(0)>0$ whose tails decay faster than $1/|x|$),$$\lim_{\epsilon\to0^+}\eta_\epsilon(x)=\left\{ \begin{array}{rl} \infty & x=0\\ 0 & x\ne0 \end{array}\right.$$(proof is an exercise).

On the other hand,$$\lim_{\epsilon\to0^+}\int_{\Bbb R}\eta_\epsilon(x)f(x)dx=\lim_{\epsilon\to0^+}\int_{\Bbb R}\eta(y)f(y\epsilon)dy\stackrel{!}{=}\int_{\Bbb R}\eta(y)\lim_{\epsilon\to0^+}f(y\epsilon)dy\stackrel{!}{=}\int_{\Bbb R}\eta(y)f(0)dy=f(0),$$provided the function $f$ is "sufficiently nice" for the $\stackrel{!}{=}$s to work (smooth, compactly supported test functions, for example, are good enough).

If we defined a function $\delta(x):=\lim_{\epsilon\to0^+}\eta_\epsilon(x)$, this would imply $\int_{\Bbb R}\delta(x)f(x)dx=0$, because the integrand vanishes outside a single point. In general this doesn't match the above calculation; we mustn't conflate $\lim_{\epsilon\to0^+}\int_{\Bbb R}\eta_\epsilon(x)f(x)dx$ with $\int_{\Bbb R}(\lim_{\epsilon\to0^+}\eta_\epsilon(x))f(x)dx$. But an object satisfying $\int_{\Bbb R}\delta(x)f(x)dx=f(0)$ is of sufficient interest that we define it anyway. It is a distribution, a generalized function and a measure, but not a function, and must not be conflated with the function $\lim_{\epsilon\to0^+}\eta_\epsilon(x)$, whose values you've seen quoted.

Incidentally, $f$ is so nice that its derivatives are too, so repeated integration by parts gives $\int_{\Bbb R}\delta^{(n)}(x)f(x)dx=(-1)^nf^{(n)}(0)$.
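As a numerical illustration of the limit $\lim_{\epsilon\to0^+}\int_{\Bbb R}\eta_\epsilon(x)f(x)dx=f(0)$, the following sketch takes the standard normal density for $\eta$ and $f=\cos$, and approximates the integral after the substitution $x=\epsilon y$. The function names, the midpoint rule and the truncation to $|y|\le10$ are assumptions made for this example only, not part of the argument.

```python
import math

def eta(y):
    """Standard normal density, used here as the nascent delta."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def smeared_value(f, eps, half_width=10.0, steps=20000):
    """Midpoint-rule approximation of the integral of eta(y) * f(eps * y).

    This equals the integral of eta_eps(x) * f(x) after substituting
    x = eps * y. The Gaussian tail beyond |y| = half_width is dropped.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        y = -half_width + (k + 0.5) * h
        total += eta(y) * f(eps * y)
    return total * h

for eps in (1.0, 0.1, 0.01):
    # For f = cos the exact value is exp(-eps**2 / 2), which tends to cos(0) = 1.
    print(eps, smeared_value(math.cos, eps), math.exp(-0.5 * eps * eps))
```

The integrals approach $f(0)=1$ even though $\eta_\epsilon(0)=\frac{1}{\epsilon\sqrt{2\pi}}$ grows without bound. This is the sense in which the limit has to be taken outside the integral rather than inside it.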

Your question about expectations

The above addresses your first question, so let's look at your second. Any PDF can be chosen for $\eta$. In that case $\eta_\epsilon$ is also a PDF, with expectation$$\int_{\Bbb R}x\eta_\epsilon(x)dx=\epsilon\int_{\Bbb R}y\eta(y)dy,$$which exists for all $\epsilon>0$ provided the original $\eta$ has a mean, and which tends to $0$ as $\epsilon\to0^+$. This agrees with $\int_{\Bbb R}x\delta(x)dx=0$: the growing height of $\eta_\epsilon$ at $0$ is compensated by its shrinking width, so it never acts as an extra weight. Your question therefore boils down to whether $\delta$ should be identified with a probability distribution. We might describe it as a generalized probability density for the degenerate distribution whose only possible value is $0$.

One way to make this masses-are-Dirac-deltas notion more rigorous is in terms of characteristic functions. If $X$ is degenerate with value $a$,$$\Bbb Ee^{itX}=e^{ita}=\int_{\Bbb R}\delta(x-a)e^{itx}dx,$$so $\delta(x-a)$ plays the role of the density. Generalizing the inversion formula by which characteristic functions specify a probability distribution, so as to support the case where densities are measures rather than "ordinary" functions, relies upon this insight.
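As a worked instance of the characteristic-function remark, assume for illustration that $X_\epsilon\sim N(a,\epsilon^2)$, i.e. the Gaussian nascent delta shifted to $a$ (this particular family is my choice of example, not something the argument requires). The standard formula for the characteristic function of a normal variable gives$$\Bbb Ee^{itX_\epsilon}=\int_{\Bbb R}\frac{1}{\epsilon\sqrt{2\pi}}\exp\left(-\frac{(x-a)^2}{2\epsilon^2}\right)e^{itx}dx=e^{ita-\epsilon^2t^2/2}\xrightarrow{\;\epsilon\to0^+\;}e^{ita}$$for every fixed $t$. The limit is the characteristic function of the degenerate distribution at $a$, so $X_\epsilon$ converges in distribution to that point mass, while $\Bbb EX_\epsilon=a$ for every $\epsilon>0$. With $a=0$ this recovers the expected value $0$ from your second question.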