Suppose $X$ and $Y$ are independent random variables with CDFs $F$ and $G$ and nonnegative support.
Suppose further that $X$ has a point mass $p$ at $0$ and otherwise some "density" $f$ (that is, $\Pr(0<X<a)=\int_0^af(x)dx$), and that $Y$ has density $g$. Define $Z=X+Y$. In the context I'm reading, $Z$ has a density $h$. I find this counterintuitive because I would think that the "bump" of size $p$ at $0$ from $X$ would contribute some sort of bump for $Z$. Can you please provide an explanation or intuition?
Because of the uncertainty, I'm also struggling to find $h$. I think that for $z>0$, I simply have $$ h(z)=\int_0^zf(x)g(z-x)dx $$ and $h(0)=pg(0)=0$. But there are probably some errors. I have seen these links [1] [2] but I don't think they cover my situation. I would appreciate a self-contained reference (book, article, anything) if you know a good one. Thanks.
Edit: for the intuition part, I've just realized it: in order to have $Z=0$, both $X$ and $Y$ have to be $0$. The former happens with probability $p$ while the latter happens with probability $0$.
The point mass at $0$ contributes an impulse of area $p$ to the pdf $f_X(x)$:
$$f_X(x) = p\delta(x) + f(x)$$
where $f(x)$ is the part of $f_X(x)$ on $(0,\infty)$. When you convolve $g$ with that impulse, you get back a copy of $g$ multiplied by $p$. Since convolution is linear, that adds to the convolution of $f$ with $g$:
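$$h(z) = \int_{0^-}^{\infty}[p\delta(x) + f(x)]g(z-x)\,dx$$
$$= pg(z) + \int_0^{z}f(x)g(z-x)\,dx.$$
The lower limit $0^-$ makes sure the whole impulse at $0$ is included, and the second integral stops at $z$ because $g(z-x)=0$ for $x>z$ ($Y$ has nonnegative support).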
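As a sanity check, here is a minimal numerical sketch of the formula $h(z) = pg(z) + \int_0^z f(x)g(z-x)\,dx$. The specific choices are assumptions made only for illustration and are not from the question: $p = 0.3$, $f(x) = (1-p)e^{-x}$ and $g(y) = 2e^{-2y}$. The helper names (`h_numeric`, `h_closed`, `simulate_z`) are likewise hypothetical. For these choices the integral can be done by hand, giving $h(z) = 2pe^{-2z} + 2(1-p)(e^{-z}-e^{-2z})$.

```python
import numpy as np

P = 0.3  # assumed point mass of X at 0 (illustrative value)

def f(x):
    # Assumed sub-density of X on (0, inf); it integrates to 1 - P.
    return (1.0 - P) * np.exp(-x)

def g(y):
    # Assumed density of Y: exponential with rate 2, zero for y < 0.
    return np.where(y >= 0, 2.0 * np.exp(-2.0 * y), 0.0)

def h_numeric(z, n=20001):
    # p*g(z) plus a trapezoid-rule approximation of int_0^z f(x) g(z-x) dx.
    x = np.linspace(0.0, z, n)
    y = f(x) * g(z - x)
    integral = np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0
    return float(P * g(z) + integral)

def h_closed(z):
    # Closed form of the same expression for the assumed f and g.
    return 2 * P * np.exp(-2 * z) + 2 * (1 - P) * (np.exp(-z) - np.exp(-2 * z))

def simulate_z(n=200_000, seed=0):
    # Draw Z = X + Y directly: X is 0 with probability P, else Exp(1).
    rng = np.random.default_rng(seed)
    x = np.where(rng.random(n) < P, 0.0, rng.exponential(1.0, n))
    y = rng.exponential(0.5, n)  # scale = 1 / rate
    return x + y

if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 3.0):
        print(z, h_numeric(z), h_closed(z))
    samples = simulate_z()
    print("fraction of samples with Z == 0:", np.mean(samples == 0.0))
```

With these assumed densities, $h(0) = pg(0) = 2p$, which is not zero: the point mass of $X$ does not create an atom in $Z$, but it does contribute the term $pg(z)$ to the density.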
Picture $g$ first being flipped about $0$ and then sliding to the right past the delta function. When the shift is $z$, the delta function picks out the single value $g(z)$, and integrating gives $pg(z)$. At $z=0$ this is $pg(0)$, and as $g$ slides along, every value of $g$ is picked out in turn and scaled by $p$.
If the point mass had been at some other point $x = a$, it would have the effect of shifting $g$ to the right by $a$ (left for negative $a$) and scaling it by $p$:
$$\int_{-\infty}^{\infty}p\delta(x-a)g(z-x)\,dx = pg(z-a)$$
I understand certain mathematicians hate the following notation, but it really helps to understand what's going on, which is why it's used in signal processing:
$$g(x)*p\delta(x) = pg(x)$$
$$g(x)*p\delta(x-a) = pg(x-a)$$
$$[p\delta(x)+f(x)]*g(x) = pg(x) + f(x)*g(x)$$
The first is an ideal transmission with scaling, and the second is an ideal delay with scaling.
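The discrete analogue of these three identities can be checked with NumPy's `np.convolve`. This is only a sketch: the sample values in `g_samples` and `f_samples`, the value of `p`, and the helper `scaled_impulse` are assumptions made up for the illustration.

```python
import numpy as np

def scaled_impulse(length, shift, p):
    # Discrete stand-in for p*delta(x - a): a single spike of height p at index `shift`.
    d = np.zeros(length)
    d[shift] = p
    return d

p = 0.3
g_samples = np.array([0.5, 0.3, 0.15, 0.05])  # assumed samples of g
f_samples = np.array([0.0, 0.4, 0.3])         # assumed samples of f

# g * p*delta(x): a scaled copy of g, padded with trailing zeros.
same = np.convolve(g_samples, scaled_impulse(3, 0, p))

# g * p*delta(x - a) with a = 2 samples: the scaled copy, delayed by 2.
delayed = np.convolve(g_samples, scaled_impulse(3, 2, p))

# Linearity: [p*delta + f] * g equals p*g plus f * g.
mixed = np.convolve(scaled_impulse(3, 0, p) + f_samples, g_samples)
split = same + np.convolve(f_samples, g_samples)

if __name__ == "__main__":
    print(same)
    print(delayed)
    print(np.allclose(mixed, split))
```

In `delayed`, the first two entries are zero and the scaled copy of `g_samples` starts at index 2, which is the "ideal delay with scaling" in discrete form.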