Convolution of two probability distributions


I would like help finding the correct definition of the convolution of two probability distributions. I found several references on this, but they are very complicated, involving random variables and more. At this stage I would just like an explanation of what the convolution operator actually does.

(similar question here but no answer)

For example, consider the following mixture of 2 CDFs $$ F(x)=\lambda G(x-\mu_1)+(1-\lambda)G(x-\mu_2) $$ where $\lambda\in [0,1]$.

Define $$ \Delta(x; \mu_1, \mu_2; \lambda)\equiv \lambda\times 1\{\mu_1\leq x\}+(1-\lambda)\times 1\{\mu_2\leq x\} $$

I found a reference claiming $$ F(\cdot)=G(\cdot)*\Delta(\cdot; \mu_1, \mu_2; \lambda) $$ where $*$ is the convolution symbol.

Which operation is $*$ performing?


There are 2 best solutions below

On BEST ANSWER

$\require{begingroup}\begingroup\renewcommand{\dd}[1]{\,\mathrm{d}#1}$There's no page 286 in the Project Euclid paper; I think you mean page 226.

tl;dr This is just a case of sloppy language/notation.

The authors use the notion of convolution just as a highbrow way to shift the base CDF $G(x)$ to $G(x - \mu_j)$, and this really has nothing to do with probability (the usual addition of independent random variables).

With $G$ being zero-symmetric as in the paper, let me use a new notation $S_j$ for the Dirac delta function $S_{j}(z)= \delta(z - \mu_j)$. This is a peak of mass $1$ at $\mu_j$, where the argument $z - \mu_j$ vanishes (is zero).

The shift of $G$ is done by the convolution ($S$ stands for shift)

\begin{align} (G * S_j)(x) &= \int_{t = -\infty}^{\infty}G(t)\, S_j(x - t) \dd{t} & &\text{the usual definition of convolution} \\ &= \int_{t = -\infty}^{\infty}G(t)\, \delta\bigl(x - t - \mu_j\bigr) \dd{t} &&\text{just the definition of $S_j$} \\ &= \int_{t = -\infty}^{\infty}G(t)\, \delta\Bigl(- \bigl(t - (x - \mu_j) \bigr) \Bigr) \dd{t} &&\text{find which $t$ makes the whole argument vanish}\\ &= G(x - \mu_j) \end{align}
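The shift can also be checked numerically. The following is only a minimal sketch and is not taken from the paper: it assumes a logistic CDF as a stand-in for the zero-symmetric $G$, a grid spacing `dx`, and a shift `mu` that is a whole number of grid steps, and it replaces $\delta(z - \mu_j)$ by a single grid cell of height `1/dx`. The helper name `convolve_full` is hypothetical.

```python
import math

def G(x):
    """Logistic CDF: an assumed stand-in for the zero-symmetric base CDF."""
    return 1.0 / (1.0 + math.exp(-x))

def convolve_full(a, v):
    """Plain discrete convolution: out[k] = sum_i a[i] * v[k - i]."""
    out = [0.0] * (len(a) + len(v) - 1)
    for i, a_i in enumerate(a):
        for j, v_j in enumerate(v):
            out[i + j] += a_i * v_j
    return out

dx = 0.1                                  # grid spacing (assumption)
xs = [k * dx for k in range(-100, 101)]   # grid on [-10, 10]
mu = 1.5                                  # shift, a whole number of grid steps
shift = round(mu / dx)                    # number of grid steps in the shift

# Discrete stand-in for delta(z - mu) on the offsets 0, dx, 2*dx, ...:
# all of the unit mass sits in one cell, so the "density" there is 1/dx.
kernel = [0.0] * (shift + 1)
kernel[shift] = 1.0 / dx

# Riemann-sum version of the convolution integral: sum_t G(t) * S(x - t) * dx.
conv = [c * dx for c in convolve_full([G(x) for x in xs], kernel)]

# Entry k of the full convolution corresponds to G(xs[k - shift]) = G(xs[k] - mu).
shifted = conv[shift:len(xs)]
expected = [G(x - mu) for x in xs[shift:]]
max_error = max(abs(s - e) for s, e in zip(shifted, expected))
```

Under these assumptions `max_error` should sit at the level of floating-point rounding, which is the discrete counterpart of $(G * S_j)(x) = G(x - \mu_j)$.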

As shown here, what the authors of the paper are talking about is the convolution of $G$ with a Dirac delta (a peak), NOT the convolution of $G$ with the "distribution function (CDF) $\Delta_k$", which is a convex combination of step functions.

Basically they got sloppy in the language and started doing things "verbally". They also made the unfortunate choice of the notation $\delta_{\mu_j}$ to represent the step functions (my $S_j$ is equivalent to their $\delta_{\mu_j}$), which in itself is okay but totally confusing when coupled with the mathematically erroneous expression $G * \Delta_k$.

In fact, nowhere in the entire paper do they actually carry out a convolution explicitly. The properties of convolution they use are independent of what $\Delta_k$ actually is, so the error stays implicit.

I bet if you ask them "how does convolving $G$ with another CDF ($\Delta_k$) result in yet another legitimate CDF", they'll respond: "oh, of course it's understood as convolution with the density for that distribution function $\Delta_k$. This is trivial and you should know that."
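For what it's worth, that reading can be written out in one line. This is a sketch in the notation of the question (not a formula quoted from the paper): treat $\Delta(\cdot;\mu_1,\mu_2;\lambda)$ as the CDF of a discrete measure with mass $\lambda$ at $\mu_1$ and mass $1-\lambda$ at $\mu_2$, and take the convolution as a Stieltjes integral against that measure.

$$ (G * \Delta)(x) = \int_{-\infty}^{\infty} G(x - t)\,\mathrm{d}\Delta(t) = \lambda\, G(x - \mu_1) + (1 - \lambda)\, G(x - \mu_2) = F(x) $$

The integral collapses to a two-term sum because $\mathrm{d}\Delta$ only charges the two points $\mu_1$ and $\mu_2$.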

$\endgroup$

On

Maybe this helps.

One way to think about measures (and probability measures) is that they assign numbers to sets; another is that they are what you integrate functions against. So the measure $\mu$ assigns the number $\mu(A)$ to the set $A$ and the number $\int f(x)\, d\mu(x)$ to the function $f$. In probability theory these correspond to the probabilities of events and to the expectations of random variables.

So given two measures $\mu$ and $\nu$ on something like the reals or on a vector space, for which "$+$" has a meaning, their convolution $\sigma$, written $\sigma=\mu*\nu$, is given by the formula $$\sigma(A)=(\mu\times\nu)(\{(x,y):x+y\in A\}),$$ assigning to the set $A$ the product measure of a certain 2-dimensional set constructed out of $A$. Of course, to understand this properly you have to know what taking the product measure $\mu\times\nu$ means.

Equivalently, $\sigma$ integrates functions by $$\int f(x)\,d\sigma(x) = \int\int f(x+y)\, d\mu(x)\,d\nu(y).$$ Of course, to understand this properly you have to know what iterated integrals are.
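For finitely supported measures both formulas reduce to finite sums, so they can be checked directly. The sketch below is only an illustration under that assumption: measures are stored as `{point: mass}` dictionaries, and the names `convolve`, `measure_of`, `integrate` and the fair-die example are hypothetical choices, not part of the definitions above.

```python
from collections import defaultdict
from fractions import Fraction

def convolve(mu, nu):
    """sigma = mu * nu for finitely supported measures given as {point: mass}."""
    sigma = defaultdict(Fraction)
    for x, p in mu.items():
        for y, q in nu.items():
            # The product measure gives mass p * q to (x, y); it lands on x + y.
            sigma[x + y] += p * q
    return dict(sigma)

def measure_of(m, A):
    """m(A) for a finite set A of points."""
    return sum(mass for x, mass in m.items() if x in A)

def integrate(f, m):
    """Integral of f against the discrete measure m."""
    return sum(f(x) * mass for x, mass in m.items())

# Assumed example: mu = nu = the distribution of one fair die.
die = {k: Fraction(1, 6) for k in range(1, 7)}
sigma = convolve(die, die)

# First formula: sigma(A) = (mu x nu)({(x, y) : x + y in A}), here with A = {7}.
A = {7}
product_mass = sum(p * q for x, p in die.items()
                   for y, q in die.items() if x + y in A)

# Second formula: integral of f d(sigma) equals the iterated integral of f(x + y).
f = lambda s: s * s
lhs = integrate(f, sigma)
rhs = sum(f(x + y) * p * q for x, p in die.items() for y, q in die.items())
```

Here `measure_of(sigma, A)` and `product_mass` agree, and so do `lhs` and `rhs`; using `Fraction` keeps the comparison exact rather than approximate.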

That's all there is to it, except for notational variations, and technical details about measurability and so on.