Adding two random variables with convolution

951 Views Asked by At

I am trying to understand the purpose of convolution of two probability functions. Also when it is appropriate to use the convolve function on two independent probability distributions.

VariableOne = {
    1: 0.5,
    3: 0.5
}
VariableTwo = {
    7: 0.5,
    8: 0.5
}

Meaning that the probability of getting either 1 or 3 is 0.5 and probability of getting either 7 or 8 is 0.5.

Now looking at wikipedia I quote The probability distribution of the sum of two or more independent random variables is the convolution of their individual distributions. However, when running the following in R:

convolve(c(0.5, 0.5), rev(c(0.5, 0.5)), type="o")

I get

0.25 0.50 0.25

But I would expect to get 4 values one for each of the following combinations:

{
    1+7 : p1,
    1+8 : p2,
    3+7 : p3,
    3+8 : p4
}

Update

However it works if I have

VariableOne = {
    1: 0.4,
    2: 0.6
}

VariableTwo = {
    1: 0.3,
    2: 0.7
}

If I run the following in R:

convolve(c(0.4, 0.6), rev(c(0.3, 0.7)), type="o")

Giving

0.12 0.46 0.42

This makes sense because I can do this which gives the correct probabilities

{
    1+1     : 0.12,
    1+2, 2+1: 0.46,
    2+2     : 0.42
}

Obviously my understanding is missing something, any help appreciated.

3

There are 3 best solutions below

0
On

You have the correct values for the sum. Also, it is easy to see (multiplication rule for independent events) that each of them has probability 1/4.

A simulation in R, confirms this (only approximately, of course):

 m = 10^6
 x = sample(c(1,3), m, rep=T)
 y = sample(c(7,8), m, rep=T)
 s = x + y
 round(table(s)/m, 4)
 ## s
 ##      8      9     10     11 
 ## 0.2498 0.2499 0.2499 0.2505 

I have not used the 'convolve' function enough to be sure of an example of that. Maybe you can figure it out.

I'm not sure whether the assignment was to do the convolution by hand from theory (almost trivial), specifically to use 'convolve' in R, or to get the answer by any means possible.

Anyhow, I hope this helps.

0
On

First, we are assuming the variables are independent so that $$ P(X=x\land Y=y)=P(X=x)\,P(Y=y)\tag{1} $$ Then, for discrete distributions on the integers, $$ \begin{align} P(X+Y=n) &=\sum_{k\in\mathbb{Z}}P(X=k\land Y=n-k)\\ &=\sum_jP(X=k)\,P(Y=n-k)\tag{2} \end{align} $$ Thus, $(2)$ says that the distribution of $X+Y$ is the convolution of the distributions of $X$ and $Y$.


Let's apply this to your data: the convolution of the distributions $$ \begin{array}{c|c} k&P(\mathtt{VariableOne}+\mathtt{VariableTwo}=k)\\\hline 7&\overbrace{0.5}^1\cdot\overbrace{0.0}^6+\overbrace{0.0}^2\cdot\overbrace{0.0}^5+\overbrace{0.5}^3\cdot\overbrace{0.0}^4=0.0\\ 8&\overbrace{0.5}^1\cdot\overbrace{0.5}^7+\overbrace{0.0}^2\cdot\overbrace{0.0}^6+\overbrace{0.5}^3\cdot\overbrace{0.0}^5=0.25\\ 9&\overbrace{0.5}^1\cdot\overbrace{0.5}^8+\overbrace{0.0}^2\cdot\overbrace{0.5}^7+\overbrace{0.5}^3\cdot\overbrace{0.0}^6=0.25\\ 10&\overbrace{0.5}^1\cdot\overbrace{0.0}^9+\overbrace{0.0}^2\cdot\overbrace{0.5}^8+\overbrace{0.5}^3\cdot\overbrace{0.5}^7=0.25\\ 11&\overbrace{0.5}^1\cdot\overbrace{0.0}^{10}+\overbrace{0.0}^2\cdot\overbrace{0.0}^9+\overbrace{0.5}^3\cdot\overbrace{0.5}^8=0.25\\ 12&\overbrace{0.5}^1\cdot\overbrace{0.0}^{11}+\overbrace{0.0}^2\cdot\overbrace{0.0}^{10}+\overbrace{0.5}^3\cdot\overbrace{0.0}^9=0.0\\ \end{array} $$ The call to reflect these distributions should be
convolve(c(0.5, 0.0, 0.5), c(0.5, 0.5, 0.0), type="o")
because the first distribution has a $0.0$ at $2$.

0
On

You have two sequences given that are indexed by the integers. They are given by
$a_n = \begin{cases}1/2 & n = 1 \vee n = 3 \\ 0 & \text{else}\end{cases}$
$b_n = \begin{cases}1/2 & n = 7 \vee n = 8 \\ 0 & \text{else}\end{cases}$

Now you want to calculate the convolution of these both sequences, which means you ned to calculate the sequence $(a *b)_n = \sum \limits _{k \in \mathbb{Z}} a_k b_{n-k}$. This will yield the desired result that one would expect. For $n = 8, 9, 10, 11$ the value will be $1/4$, while all other values will be zero.

Now why does the result in R differ from this? Quite simple, R doesn't know at which position you have the values $1/2$, you have only given the information that the value $1/2$ appears twice in every sequence. Now R must guess the positions at which it has to position these values, which will be (canonically) the positions 1 and 2. The same thing holds for the second sequence, so you've actually calculated the convolution $c*c$ where
$c_n = \begin{cases}1/2 & n = 1 \vee n = 2 \\ 0 & \text{else}\end{cases}$

We can fix this problem relatively simple: We need to tell R the correct sequence. This means you need to calculate the convolution of $c(.5, 0, .5, 0, 0, 0, 0, 0)$ and $c(0, 0, 0, 0, 0, 0, .5, .5)$. This will lead to the correct result:

> convolve(c(.5, 0, .5, 0, 0, 0, 0, 0), c(0, 0, 0, 0, 0, 0, .5, .5), type="o")
 [1]  2.500000e-01  2.500000e-01  2.500000e-01  2.500000e-01  5.419008e-17
 [6] -1.152638e-17  6.661338e-17  3.028642e-17  5.599892e-17  5.181041e-17
[11] -6.804757e-19  1.801594e-17  2.960595e-17  1.982479e-17  4.113233e-17

Note that the values less than $10^{-16}$ should be 0 and they only appear because R uses the FFT to calculate the convolution - therefore there are some inaccuracies when working with floating point numbers. Also, R defines the convolution in the following way:

 If ‘r <- convolve(x, y, type = "open")’ and ‘n <- length(x)’, ‘m
 <- length(y)’, then

                   r[k] = sum(i; x[k-m+i] * y[i])                   

 where the sum is over all valid indices i, for k = 1, ..., n+m-1.

this means that the result is a shifted version of the "correct convolution". To get the proper convolution you would now need to pad the result with zeroes.