The convolution formula goes as follows:
If the random variables $X$ and $Y$ are independent and continuous with density functions $f_X$ and $f_Y$, then the density function of $Z=X+Y$ is $$f_Z(z)=\int_{-\infty}^\infty f_X(x)f_Y(z-x)\,\mathrm{d}x\quad \text{for } z\in\mathbb R.$$
However, the theorem does not explicitly state that those two random variables have to be jointly continuous. Why not? My textbook assumes joint continuity to show,
$$f_Z(z)=\int_{-\infty}^\infty f_{X,Y}(x,z-x) \, \mathrm{d}x,$$
which almost immediately gives us the convolution formula for independent random variables that are continuous.
So my question is: why is joint continuity not assumed in the theorem?
Later on my textbook gives the following example:
Let $X$ and $Y$ be independent random variables having, respectively, the gamma distribution with parameters $s$ and $\lambda$, and the gamma distribution with parameters $t$ and $\lambda$.
They proceed to apply the convolution formula, yet they nowhere state that those two random variables are jointly continuous...
So is it a mistake, or am I missing something here?
Joint continuity is not included as a hypothesis when independence is assumed. The reason is that if each of two random variables has a continuous distribution and they are independent, then their joint distribution is continuous.
Here, by "continuous", I mean not just that the c.d.f. is continuous, but that there is a probability density function (a somewhat stronger condition). That means that for every measurable set $A$ you have $$ \Pr(X\in A) = \int_A f_X(x)\,dx $$ and similarly for $Y$. Independence implies
\begin{align}
\Pr(X\in A\ \&\ Y\in B) & = \Pr(X\in A)\Pr(Y\in B) = \int_A f_X(x)\,dx \int_B f_Y(y)\,dy \\[10pt]
& = \Big( \text{something not depending on } y \Big)\times \int_B f_Y(y)\,dy = c\int_B f_Y(y)\,dy \\[10pt]
& = \int_B c f_Y(y)\,dy = \int_B\left( \int_A f_X(x)\,dx \right) f_Y(y)\,dy \\[10pt]
& = \int_B\left( \int_A f_X(x)\,dx \right) \Big( \text{something not depending on } x \Big) \,dy \\[10pt]
& = \int_B \left( \int_A f_X(x) f_Y(y)\,dx \right) \,dy \\[10pt]
& = \iint_{A\times B} f_X(x) f_Y(y) \,d(x,y) \quad \text{by Fubini's theorem or Tonelli's theorem.}
\end{align}

This works for sets of the form $A\times B$, i.e. $(x,y)$ is in that set if and only if $x\in A$ and $y\in B$. Now there's the problem of more general sets, for example $(x,y)\in C$ where $C$ is a disk in the $xy$-plane. Can one prove that $$ \Pr((X,Y)\in C) = \iint_C f_X(x) f_Y(y)\, d(x,y) \text{ ?} $$

This involves some theory of integration beyond what will fit in the tiny margin of this page (o.k., I mean more than I'm going to write here). The standard tool is a uniqueness theorem for measures: the product sets $A\times B$ are closed under intersection and generate the Borel sets of the plane, so two probability measures that agree on all of them agree on every Borel set $C$. Once one shows this, one concludes that $(x,y)\mapsto f_X(x) f_Y(y)$ is the joint density.
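As a sanity check on the gamma example from the question, here is a minimal numerical sketch using only the Python standard library. It assumes that "parameters $s$ and $\lambda$" means shape $s$ and rate $\lambda$ (the textbook's convention is not quoted above, so this is a guess), and the values of `s`, `t` and `lam`, as well as the helper names `gamma_pdf` and `convolve_at`, are made up for illustration. It approximates the convolution integral with a midpoint rule and compares it with the density of the gamma distribution with shape $s+t$ and rate $\lambda$, which is the standard closed form for this sum.

```python
import math

def gamma_pdf(x, shape, rate):
    """Gamma density in the shape/rate parametrisation (assumed); zero for x <= 0."""
    if x <= 0:
        return 0.0
    return rate**shape * x**(shape - 1) * math.exp(-rate * x) / math.gamma(shape)

def convolve_at(z, f, g, n=20000):
    """Midpoint-rule approximation of the integral of f(x) * g(z - x) over 0 < x < z.

    Both densities vanish on the negative half-line, so the integral over the
    whole real line reduces to this finite interval.
    """
    if z <= 0:
        return 0.0
    h = z / n
    return h * sum(f((i + 0.5) * h) * g(z - (i + 0.5) * h) for i in range(n))

# Illustrative parameter values (not taken from the textbook example).
s, t, lam = 2.0, 3.0, 1.5

def f_X(x):
    return gamma_pdf(x, s, lam)

def f_Y(y):
    return gamma_pdf(y, t, lam)

for z in (0.5, 2.0, 5.0):
    numeric = convolve_at(z, f_X, f_Y)
    closed_form = gamma_pdf(z, s + t, lam)
    print(f"z={z}: convolution={numeric:.6f}, gamma(s+t, lam) density={closed_form:.6f}")
```

The two columns should agree to several decimal places. This proves nothing, of course, but it is consistent with the claim that independence plus marginal densities is all the convolution formula needs.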