In terms of marginal density, how does one know that summing over the $x$ (or rather along the linear line) values for the joint density of $(x,z-x)$ give us the density function of $z$? More importantly, can someone explain this intuitively? (aside from proofs)
2026-04-03 17:13:26.1775236406
On
Convolution Theorem and Marginal Density Intuition.
1.9k Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
2
There are 2 best solutions below
0
On
if the variables are independent, you have $$f_{XY}(x,y) = f_X(x)f_y(y).$$ Hence $$P(X + Y \le x) = \iint_{(s,t)|s+t\le x} f_X(s)f_y(t)\,ds\,dt = \int_{-\infty}^\infty\int_{-\infty}^{x - t} f_X(s)f_Y(t)\,ds\,dt$$ Now differentiate and invoke Leibnitz's rule to get $$f_{X+Y}(x) = \int_{-\infty}^\infty f_X(x-t)f_Y(t)\,dt = (f_X*f_Y)(x).$$

You are looking at the final formula and seeking intuition rather than thinking about how the formula was arrived at. Indeed, the "thought process" that your book offers as an explanation of the formula leaves much to be desired (e.g. it puts joint density in eyebrow-wiggling quotes instead of sum). Here is what I wrote in a previous answer:
Let $Z = X+Y$. For any fixed value of $z$, $$F_Z(z) = P\{Z \leq z\} = P\{X+Y \leq z\}\\ = \int\int_{x+y\le z} f_{X,Y}(x,y)\,\mathrm d(x,y) =\int_{-\infty}^{\infty}\left[ \int_{-\infty}^{z-x} f_{X,Y}(x,y)\,\mathrm dy\right]\,\mathrm dx$$ and so, using the rule for differentiating under the integral sign (see the comments following this answer if you have forgotten this) $$\begin{align*} f_Z(z) &= \frac{\partial}{\partial z}F_Z(z)\\ &= \frac{\partial}{\partial z}\int_{-\infty}^{\infty}\left[ \int_{-\infty}^{z-x} f_{X,Y}(x,y)\,\mathrm dy\right] \,\mathrm dx\\ &= \int_{-\infty}^{\infty}\frac{\partial}{\partial z}\left[ \int_{-\infty}^{z-x} f_{X,Y}(x,y)\,\mathrm dy\right]\,\mathrm dx\\ &= \int_{-\infty}^{\infty} f_{X,Y}(x,z-x)\,\mathrm dx\tag{1} \end{align*}$$ When $X$ and $Y$ are independent random variables, the joint density is the product of the marginal densities and we get the convolution formula $$f_{X+Y}(z) = \int_{-\infty}^{\infty} f_{X}(x)f_Y(z-x)\,\mathrm dx ~~ \text{for independent random variables} ~X~\text{and}~Y.\tag{2}$$
So much so for the math. What about the "thought process" that can give an intuitive feel for why the convolution integral gives the density of $Z = X+Y$? Well, remember that $f_Z(z)$ is a density (measured in units of probability mass/length) and not a probability. In fact, $P\{Z = z\} = 0$ for all $z$, while the probability of the event $A$ that $Z$ has value inside a small interval of length $\Delta z$ centered at $z$ is $$P(A) = P\left\{z - \frac{\Delta z}{2} < Z < z + \frac{\Delta z}{2}\right\} \approx f_Z(z)\Delta z.$$ Now, in terms of $(X,Y)$, we have that the event occurs $A$ occurs whenever the random point $(X,Y)$ lies inside a diagonal strip bounded by the lines $x+y = z + \frac{\Delta z}{2}$ and $x+y = z - \frac{\Delta z}{2}$. These lines cross the vertical axis at the points $(0, z + \frac{\Delta z}{2})$ and $(0, z - \frac{\Delta z}{2})$, that is, the vertical separation of the lines is $\Delta z$.
Now, we can find $P(A)$ by dividing up this diagonal strip using vertical lines spaced $\Delta x$ apart, thus creating narrow parallelograms whose vertical sides are of length $\Delta z$ and are spaced $\Delta x$ apart and whose center is $(x_i,z-x_i)$. Thus, $$P\{(X,Y) \in \text{parallelogram}\} \approx f_{X,Y}(x_i,z-x_i)\times \{\text{area of parallelogram}\} = f_{X,Y}(x_i,z-x_i) \Delta x \Delta z$$ so that $$P(A) = f_Z(z)\Delta z = \left[\sum_i f_{X,Y}(x_i,z-x_i) \Delta x\right]\times \Delta z. \tag{3}$$ The quantity in square brackets in $(3)$ is a sum (in fact a Riemann sum) and by taking the limit as the parallelograms grow ever narrower, that sum becomes an integral, that is, $(3)$ leads to $$f_Z(z)\Delta z = \int_{-\infty}^\infty f_{X,Y}(x,z-x) \, \mathrm dx\times \Delta z$$ thus proving $(1)$. It is in this sense that
The pdf of$Z$is simply the sum of the "joint density" at the points of the line$z = x+y$ as your book claims.Personally, I vastly prefer the CDF approach over such messy calculations and inelegant "thought processes"