Sum of two truncated gaussian

6.3k Views Asked by At

What is the CDF and the PDF (or approximation) of the sum of two independent truncated gaussian random variable $X \sim TN_x(\mu_x,\sigma_x;a_x,b_x)$ and $Y \sim TN_y(\mu_y,\sigma_y;a_y,b_y)$?

$TN(\mu,\sigma;a,b)$ denotes the truncated normal distribution, where a and b are the the lower and upper bounds of the truncation, respectively.

2

There are 2 best solutions below

4
On

OK, I believe the variables are independent. Then let $\Phi(x,\mu,\sigma)$ be normal CDF and $\rho(x,\mu,\sigma)$ it's density $$ \rho(x,\mu,\sigma)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$ Thus $$ P_x(x)=(x\le a_x)\Phi_(a_x,\mu_x,\sigma_x)+(a_x < x < b_x)\int_{a_x}^x \rho(x,\mu_x,\sigma_x) dx+(x>b_x)(1-\Phi(b_x,\mu_x,\sigma_x)) $$ $$ P_y(y)=(y\le a_y)\Phi_(a_y,\mu_y,\sigma_y)+(a_y < y < b_y)\int_{a_y}^y \rho(y,\mu_y,\sigma_y) dy+(y>b_y)(1-\Phi(b_y,\mu_y,\sigma_y)) $$ So the mean operator $E$ will look as follows: $$ E(f(X))=\Phi(a_x,\mu_x,\sigma_x)f(a_x)+\int_{a_x}^{b_x} f(x)\rho(x,\mu_x,\sigma_x) dx+f(b_x)(1-\Phi(b_x,\mu_x,\sigma_x)) $$ Now using the formula $P_z(z)=E(P_y(z-X))$ and defining $f(x)=P_y(z-x)$ we can write the following $$ P_z(z)=\Phi(a_x,\mu_x,\sigma_x)P_y(z-a_x)+\int_{a_x}^{b_x} P_y(z-x)\rho(x,\mu_x,\sigma_x) dx+P_y(z-b_x)(1-\Phi(b_x,\mu_x,\sigma_x)) $$ May be going through different inequalities you can simplify this expression, plugging the expression ofr $P_y$ into the last formula.

EDIT As was noted by wolfies I wrote $P_{x,y}$ for censored variables rather truncated. Thank you for catching me. We can rewrite truncated distribution even simpler: $$ P_y(y)=\frac{1}{\Phi(b_y,\mu_y,\sigma_y)-\Phi(a_y,\mu_y,\sigma_y)}\int_{\max(y,a_y)}^{\min(y,b_y)} \rho(y,\mu_y,\sigma_y) dy =(y>a_y) \frac{\Phi(\min(y,b_y),\mu_y,\sigma_y)-\Phi(a_y,\mu_y,\sigma_y)}{\Phi(b_y,\mu_y,\sigma_y)-\Phi(a_y,\mu_y,\sigma_y)} $$ and $$ E(f(X))=\frac{1}{\Phi(b_x,\mu_x,\sigma_x)-\Phi(a_x,\mu_x,\sigma_x)}\int_{a_x}^{b_x} f(x)\rho(x,\mu_x,\sigma_x) dx $$ So $$ P_z(z)=\frac{1}{\Phi(b_x,\mu_x,\sigma_x)-\Phi(a_x,\mu_x,\sigma_x)}\int_{a_x}^{b_x} P_y(z-x)\rho(x,\mu_x,\sigma_x) dx $$

And finally we have very similar formula for density $$ \rho_z(z)=\frac{\int_{a_x}^{b_x} (a_y<z-x<b_y)\rho(z-x,\mu_y,\sigma_y)\rho(x,\mu_x,\sigma_x) dx}{(\Phi(b_x,\mu_x,\sigma_x)-\Phi(a_x,\mu_x,\sigma_x))(\Phi(b_y,\mu_y,\sigma_y)-\Phi(a_y,\mu_y,\sigma_y))} $$

It can be simlified further $$ \rho_z(z)=\frac{\int_{\max(a_x,z-b_y)}^{\min(b_x,z-a_y)} \rho(z-x,\mu_y,\sigma_y)\rho(x,\mu_x,\sigma_x) dx}{(\Phi(b_x,\mu_x,\sigma_x)-\Phi(a_x,\mu_x,\sigma_x))(\Phi(b_y,\mu_y,\sigma_y)-\Phi(a_y,\mu_y,\sigma_y))} $$ The last result almost surely can be completed in $\Phi$ terms.

0
On

This is not an easy problem to obtain a closed-form solution to. As always, there are a number of different approaches, but unfortunately many of them seem to yield intractable outcomes. The approach taken here is to proceed manually step-by-step using the Method of Transformations, aided by using a computer algebra system to do the nitty-gritties where that is helpful.

Given: Let:

  • $X \sim N(\mu_1, \sigma_1^2)$ be doubly truncated (below and above) at $(a_1, b_1)$, where $0<a_1<b_1$, and

  • $Y \sim N(\mu_2, \sigma_2^2)$ be doubly truncated (below and above) at $(a_2, b_2)$, where $0<a_2<b_2$.

Here is an illustrative plot of the doubly truncated Normal pdf, given different parameter values:

enter image description here

Solution

By virtue of independence, the joint pdf of $(X,Y)$, say $f(x,y)$ is the product of the individual pdf's:

enter image description here

where Erf[.] denotes the error function.

Step 1: Let $Z=X+Y$ and $W=Y$. Then, using the Method of Transformations, the joint pdf of $(Z,W)$, say $g(z,w)$, is given by:

enter image description here

where:

  • Transform is a function from the mathStatica package for Mathematica, which automates the calculation of the transformation and required Jacobian.

  • the transformation $(Z=X+Y, W=Y)$ induces dependency in the domain of support between $Z$ and $W$. In particular, since $X$ is bounded by $(a_1,b_1)$, it follows that $Z=X+Y$ is bounded by $(a_1+W, b_1+W)$. This dependency is captured using the Boole statement in the line above which acts as an indicator function. Because the dependency has been captured into pdf g itself, we can enter the domain of support for pdf g in standard rectangular fashion as:

enter image description here

Step 2: Given the joint pdf $g(z,w)$ just derived, the marginal pdf of $Z$ is:

enter image description here

which yields the 4-part piecewise solution:

enter image description here

... defined subject to the domain of support: $a_1 + a_2 < Z < b_1 + b_2$.

The solution sol appears very small on screen: to view it larger, either open the image in a new window, or click here.

The Marginal computation is far from simple: it takes about 10 minutes of pure computing time to solve on the new R2-D2 Mac Pro.

All done.

Illustrative Plot

Here is a plot of the solution pdf of $Z = X + Y$, for the same parameter values used in the first diagram:

  • $(\mu_1 = 5, \sigma_1 = 2, a_1 = 5, b_1 = 8)$
  • $(\mu_2 = 1, \sigma_2 = 4, a_2 = 7, b_2 = 9)$

enter image description here

Monte Carlo Check

It is always good idea to check symbolic work with a quick Monte Carlo comparison, just to make sure that no mistakes have crept in. Here is a comparison of the empirical pdf (Monte Carlo simulation of the sum of the doubly truncated Normals) [the squiggly blue curve] plotted together with the exact theoretical pdf (sol) derived above [ red dashed curve ].

enter image description here

Looks fine!

Notes

  1. The Marginal function used above is also from the mathStatica package for Mathematica. As disclosure, I should add that I am one of the authors of the software used.