Formalizing the use of Dirac delta in a PDF

Suppose that you have a CDF $F:\mathbb{R}_+\to\mathbb{R}_+$ that is differentiable everywhere except at the single point $x=1$, where it jumps: $\lim_{z\to 1^{-} }F(z)=\frac{1}{3}$ but $F(1)=\frac{2}{3}$. I'd like to define the PDF $f$ associated with $F$ such that $f(x)=F'(x)$ at $x\neq 1$, and $f(1)=\frac{1}{3}{\mathbf\delta}$ where ${\mathbf\delta}$ is the Dirac delta function. However, I was told that this is informal: I cannot formally define the range of $f$ to contain both $\mathbb{R}_+$ and $\{a {\mathbf \delta}: a\in \mathbb{R}_+\}$. Is my definition formal? If not, how can I make it formal? Thank you!

Best answer

The natural way to make this rigorous is to use measure theory.

A CDF is a way of describing the statistics of a (real-valued) random variable $X$, the idea being that $F(x) = \mathbb{P}\{X \leq x\}$.

Another way, which turns out to be equivalent, but is arguably more natural, is a (Borel) probability measure on $\mathbb{R}$. Precisely, a (finite, positive) Borel measure $\nu$ assigns to any given Borel set $A \subseteq \mathbb{R}$ a number $\nu(A) \in [0,\infty)$. (Think of a Borel set as a "nice" subset of $\mathbb{R}$, or one that can be built out of intervals. Intervals are Borel sets.) It is a probability measure if $\nu(\mathbb{R}) = 1$. For us, the idea is $\nu(A) = \mathbb{P}\{X \in A\}$, that is, $\nu$ is a more vivid description of $X$.

It is a theorem of Lebesgue-Stieltjes integration that there is a one-to-one correspondence between Borel probability measures and CDFs through the map $\nu \mapsto F^{\nu}$ given by $F^{\nu}(x) = \nu((-\infty,x])$. Hence anytime you work with a CDF, there is also an associated probability measure $\nu$.
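To make the correspondence concrete, here is a minimal Python sketch of how $\nu$ can be read off from a CDF: $\nu((a,b]) = F(b) - F(a)$, and the mass of a single point is the size of the jump, $\nu(\{x\}) = F(x) - F(x^{-})$. The particular `F` below is a hypothetical choice made only for illustration (the question does not specify one); it merely has the stated jump from $\frac{1}{3}$ to $\frac{2}{3}$ at $x = 1$. The names `nu_interval` and `nu_point` are likewise assumptions, and the left limit is only approximated with a small offset.

```python
import math


def F(x: float) -> float:
    """Hypothetical CDF (an assumption, not from the question).

    It rises linearly to 1/3 on [0, 1), jumps to 2/3 at x = 1,
    and then approaches 1 exponentially.
    """
    if x < 0:
        return 0.0
    if x < 1:
        return x / 3.0
    return 1.0 - math.exp(-(x - 1.0)) / 3.0


def nu_interval(a: float, b: float) -> float:
    """Measure of the half-open interval (a, b], for a <= b."""
    return F(b) - F(a)


def nu_point(x: float, eps: float = 1e-9) -> float:
    """Approximate nu({x}) = F(x) - F(x-), using F(x - eps) for the left limit."""
    return F(x) - F(x - eps)
```

With this `F`, `nu_point(1.0)` is close to $\frac{1}{3}$ (the jump), while `nu_point(0.5)` is close to $0$, since `F` is continuous there.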

You can think of $\nu$ as being the "derivative" of $F^{\nu}$ (or $F^{\nu}$ being the "anti-derivative" of $\nu$). The relevance to your question: this makes sense whether or not $F^{\nu}$ is differentiable in the usual sense.

For example, if $F^{\nu}$ is differentiable at every point in $\mathbb{R}$ and $(F^{\nu})' = f$, then $\nu$ is given (by Lebesgue integration) through the formula: \begin{equation*} \nu(A) = \int_{A} f(x) \, dx. \end{equation*} A shorthand way to write this is $\nu = f(x) \, dx$ (hence, almost literally, $\nu$ is the derivative of $F^{\nu}$).

As your example shows, there is no reason $F^{\nu}$ should be differentiable. Basically, your question seems to be: what replaces the representation $\nu = f(x) \, dx$ in general?

The Lebesgue Decomposition Theorem says that, given a probability measure $\nu$, there is a nonnegative integrable function $f$ and a measure $\nu^{s}$, singular with respect to Lebesgue measure, such that \begin{equation*} \nu = f(x) \, dx + \nu^{s}. \end{equation*}
The sum is interpreted "set-wise" (that is, pointwise) as $\nu(A) = \int_{A} f(x) \, dx + \nu^{s}(A)$. The measure $\nu^{s}$ is singular: all of its mass sits on a set of Lebesgue measure zero (a set of "length zero"), so it cannot be described by a density.

In your example, $F = F^{\nu}$ for the probability measure $\nu$ given by \begin{equation*} \nu(A) = \int_{A} F'(x) \, dx + \frac{1}{3} \delta_{1}(A) \end{equation*} where $\delta_{1}$ is the measure defined by \begin{equation*} \delta_{1}(A) = \left\{ \begin{array}{r l} 1, & \text{if} \, \, 1 \in A \\ 0, & \text{otherwise} \end{array} \right. \end{equation*} We call $\delta_{1}$ the Dirac mass at $1$; for an arbitrary $x \in \mathbb{R}$, $\delta_{x}$ is defined analogously.
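One practical consequence of this formula is that integrals against $\nu$ split the same way: $\mathbb{E}[g(X)] = \int g(x) F'(x) \, dx + \frac{1}{3} g(1)$. The following Python sketch illustrates this numerically. Everything concrete in it is an assumption made for illustration: the density `density` (taken to be $\frac{1}{3}$ on $[0,1)$ and $\frac{1}{3} e^{-(x-1)}$ on $(1,\infty)$, which is not from the question), the helper names, the midpoint rule, and the truncation of the integral at `upper`.

```python
import math

ATOM_LOCATION = 1.0   # where the Dirac mass sits
ATOM_MASS = 1.0 / 3.0  # size of the jump of F at x = 1


def density(x: float) -> float:
    """Hypothetical F'(x) for x != 1 (an assumption, not from the question)."""
    if x < 0:
        return 0.0
    if x < 1:
        return 1.0 / 3.0
    return math.exp(-(x - 1.0)) / 3.0


def integrate(g, a: float, b: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def expectation(g, upper: float = 60.0) -> float:
    """E[g(X)] = (integral of g * density) + ATOM_MASS * g(ATOM_LOCATION).

    The integral is split at the atom so each piece has a smooth integrand,
    and it is truncated at `upper`, where the assumed density is negligible.
    """
    def weighted(x: float) -> float:
        return g(x) * density(x)

    continuous_part = integrate(weighted, 0.0, 1.0) + integrate(weighted, 1.0, upper)
    return continuous_part + ATOM_MASS * g(ATOM_LOCATION)
```

For instance, `expectation(lambda x: 1.0)` should come out close to $1$ (the total mass), with $\frac{2}{3}$ contributed by the density and $\frac{1}{3}$ by the Dirac mass.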

More often than not, $\nu^{s}$ looks something like the previous example (or, more generally, it is a sum of finitely many Dirac masses). In most applications, there are sequences $\{x_{n}\}_{n \in \mathbb{N}} \subseteq \mathbb{R}$ of distinct points and $\{p_{n}\}_{n \in \mathbb{N}} \subseteq [0,1]$ with $\sum_{n = 1}^{\infty} p_{n} \leq 1$ such that \begin{equation*} \nu^{s}(A) = \sum_{n = 1}^{\infty} p_{n} \delta_{x_{n}}(A) = \sum_{x_{n} \in A} p_{n}. \end{equation*} (It is worth noting that if $\nu^{s}$ is a probability measure, or equivalently $f = 0$ above, then $\sum_{n = 1}^{\infty} p_{n} = 1$ and $\nu^{s}$ is precisely the probability measure describing a discrete random variable with PMF $\{p_{n}\}_{n \in \mathbb{N}}$. Hence the measure-theoretic framework treats "continuous" and "discrete" RVs, and with them PDFs and PMFs, in a unified way.)
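As a small illustration of a purely atomic singular part, the Python sketch below evaluates $\nu^{s}(A)$ by summing the weights of the atoms that lie in $A$. The atom locations and weights in `ATOMS` are made-up values chosen only for this example, and representing the set $A$ by an indicator function is a convenience of the sketch, not part of the theory.

```python
from typing import Callable, Dict

# Hypothetical atoms x_n -> p_n (illustrative values only). The weights sum
# to 0.6, so the remaining 0.4 of the total mass would have to be carried by
# the density part f(x) dx.
ATOMS: Dict[float, float] = {0.0: 0.2, 1.0: 0.3, 2.5: 0.1}


def nu_s(indicator: Callable[[float], bool]) -> float:
    """nu^s(A): the sum of p_n over the atoms x_n that belong to A.

    The set A is passed as its indicator function.
    """
    return sum(p for x, p in ATOMS.items() if indicator(x))
```

For example, `nu_s(lambda x: x <= 1.0)` adds the weights at $0$ and $1$, while `nu_s(lambda x: 0.0 < x < 1.0)` is $0$ because the open interval contains no atom.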

However, the singular measure $\nu^{s}$ can also have a "continuous" part: for example, it is possible that $\nu^{s}(\{x\}) = 0$ for all $x \in \mathbb{R}$ (so it has no Dirac parts), but still it cannot be represented using a density. Usually, you won't see such measures unless you are asking about the long-time behavior of a stochastic process or an iterative procedure. For example, if $X_{n} \in [0,1]$ is an RV whose ternary expansion up to order $n$ is equally likely to be any string consisting of $0$'s and $2$'s, but no $1$'s, then $X_{n} \overset{d}{\to} \tilde{X}$, where $\tilde{X}$ is described by such a singular probability measure that cannot be described using Dirac masses. $\tilde{X}$ is the RV in $[0,1]$ whose ternary digits are independent, each equally likely to be a $0$ or a $2$, but never a $1$; its CDF is the "Devil's Staircase" or Cantor function. (Interestingly, the Cantor function has derivative zero at almost every point, in the sense of Lebesgue measure, yet it is continuous and remains a CDF.)
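The Cantor example can be explored numerically. The Python sketch below draws samples of $X_{n}$ by choosing each of the first $n$ ternary digits to be $0$ or $2$ with probability $\frac{1}{2}$, and approximates the Cantor function by reading off ternary digits. The function names, the truncation `depth`, and the use of floating-point digit extraction are choices made for this sketch, so the values are approximations rather than exact evaluations.

```python
import random


def sample_X_n(n: int, rng: random.Random) -> float:
    """Draw X_n: ternary digits 1..n are independent, each 0 or 2 with prob. 1/2."""
    return sum(rng.choice((0, 2)) * 3.0 ** -k for k in range(1, n + 1))


def cantor_cdf(x: float, depth: int = 40) -> float:
    """Approximate the Cantor function at x from its first `depth` ternary digits.

    Each ternary digit 2 contributes a binary digit 1; the first ternary
    digit 1 means x lies in a removed middle third, where the function is flat.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, weight = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)
        x -= digit
        if digit == 1:
            return value + weight
        if digit == 2:
            value += weight
        weight /= 2.0
    return value
```

A quick check is to draw many samples with, say, `rng = random.Random(0)` and compare the fraction of samples that are at most `t` with `cantor_cdf(t)` for a few values of `t`. None of the samples should land in the open middle third $(\frac{1}{3}, \frac{2}{3})$, which is exactly where the Cantor function is constant.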