Histogram convolution

1.3k Views Asked by Bumbble Comm At 06 Apr 2026 - 4:06

I have some data which I currently aggregate as histograms. This gives me a few histograms, let's say $H_0, ..., H_n$. Now I want to convolve some of the histograms in the same way PDFs, e.g., normal distributions, are convolved (that is, from the distributions of $X$ and $Y$ I want to get the distribution of $X+Y$).

So far, I haven't really been able to find much information about this, but I have two approaches:

The first is manual convolution of the histograms: to convolve two histograms, $H_a$ and $H_b$, I take every bin $h_{a,i}$ from $H_a$, and for each bin $h_{b,j}$ from $H_b$ I calculate a new bin $h_{new}$ with the lower edge equal to $\mathrm{lower}(h_{a,i}) + \mathrm{lower}(h_{b,j})$, the upper edge equal to $\mathrm{upper}(h_{a,i}) + \mathrm{upper}(h_{b,j})$, and the value inside the bin equal to $\mathrm{value}(h_{a,i}) \cdot \mathrm{value}(h_{b,j})$.

This results in $|H_a| \cdot |H_b|$ bins that may overlap. To resolve the overlaps, I split the bins, assuming some internal distribution inside each bin, until no overlapping bins remain, and I merge bins that have exactly the same lower and upper edges.
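The manual convolution described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference implementation: the function name `convolve_histograms` and the output grid `out_edges` are hypothetical, the bin values are taken to be probabilities that sum to 1, and the "internal distribution" of each combined bin is assumed to be uniform (the exact sum of two uniform bins would be trapezoidal, so this is an approximation). Instead of repeatedly splitting overlapping bins, each combined bin is distributed onto a fixed output grid in proportion to its overlap.

```python
import numpy as np

def convolve_histograms(edges_a, mass_a, edges_b, mass_b, out_edges):
    """Approximate the histogram of X + Y from histograms of X and Y.

    mass_a / mass_b hold per-bin probabilities (each summing to 1).
    out_edges should cover [edges_a[0] + edges_b[0], edges_a[-1] + edges_b[-1]],
    otherwise some probability mass is dropped.
    """
    edges_a = np.asarray(edges_a, dtype=float)
    edges_b = np.asarray(edges_b, dtype=float)
    out_edges = np.asarray(out_edges, dtype=float)
    out = np.zeros(len(out_edges) - 1)

    for i in range(len(mass_a)):
        for j in range(len(mass_b)):
            # Combined bin: sum of lower edges to sum of upper edges.
            lo = edges_a[i] + edges_b[j]
            hi = edges_a[i + 1] + edges_b[j + 1]
            mass = mass_a[i] * mass_b[j]
            # Length of the overlap between [lo, hi] and every output bin.
            overlap = np.minimum(hi, out_edges[1:]) - np.maximum(lo, out_edges[:-1])
            overlap = np.clip(overlap, 0.0, None)
            # Assumption: mass is spread uniformly over the combined bin.
            out += mass * overlap / (hi - lo)
    return out

# Two identical two-bin histograms on [0, 2], result on a grid over [0, 4].
h_new = convolve_histograms([0, 1, 2], [0.5, 0.5],
                            [0, 1, 2], [0.5, 0.5],
                            [0, 1, 2, 3, 4])
```

Using a fixed output grid avoids the bookkeeping of splitting and merging bins, at the cost of having to choose that grid up front.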

The result is then a new histogram $H_{new}$.

The second approach is to use samples of the PDFs described by $H_a$ and $H_b$ (from scipy.stats.rv_histogram.pdf) and use scipy.convolve on those samples. With this approach in particular, I am not sure it really does what I want: I don't know where to sample the PDFs, and I don't know how it deals with the corresponding "overlapping bins" situation. In short, I am unclear about this approach and about whether it leads to the same result as the first one.
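One way the second approach could look is sketched below, under a few assumptions that are not from the question: both PDFs are sampled at the midpoints of a grid with the same spacing `dx`, the discrete convolution is multiplied by `dx` so that it approximates the convolution integral, and the result is placed on a grid starting at the sum of the two first sample points. The helper `hist_pdf` is a hypothetical NumPy-only stand-in for the piecewise-constant density that `scipy.stats.rv_histogram(...).pdf` provides, and `convolve_sampled` is likewise an illustrative name.

```python
import numpy as np

def hist_pdf(x, counts, edges):
    """Piecewise-constant density of a histogram, evaluated at points x."""
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)
    density = counts / (counts.sum() * np.diff(edges))
    idx = np.searchsorted(edges, x, side="right") - 1
    inside = (idx >= 0) & (idx < len(counts))
    # Zero density outside the histogram's range.
    return np.where(inside, density[np.clip(idx, 0, len(counts) - 1)], 0.0)

def convolve_sampled(counts_a, edges_a, counts_b, edges_b, dx):
    """Approximate the density of X + Y by discrete convolution."""
    # Sample both PDFs at midpoints of grids with the SAME spacing dx.
    xa = np.arange(edges_a[0] + dx / 2, edges_a[-1], dx)
    xb = np.arange(edges_b[0] + dx / 2, edges_b[-1], dx)
    fa = hist_pdf(xa, counts_a, edges_a)
    fb = hist_pdf(xb, counts_b, edges_b)
    # Multiplying by dx turns the discrete sum into a Riemann sum
    # for the convolution integral.
    f_sum = np.convolve(fa, fb) * dx
    # Output sample k corresponds to the position xa[0] + xb[0] + k * dx.
    x_sum = xa[0] + xb[0] + dx * np.arange(len(f_sum))
    return x_sum, f_sum

x_sum, f_sum = convolve_sampled([1, 1], [0, 1, 2], [1, 1], [0, 1, 2], dx=0.1)
```

The result integrates to 1 exactly only when `dx` divides every bin width evenly. Otherwise the sampled PDFs no longer sum to 1 and the output may need renormalising.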

My questions would be these:

1. Does the first approach make sense, and is it a correct way of convolving two distributions described by histograms?
2. Does the second approach make sense, and is it a correct way of convolving two distributions described by histograms? How exactly do I have to do this for it to work? Edit: i.e., where do I have to sample the PDF?
3. Is there a better way to achieve what I want, for example starting with the raw data without aggregating it into histograms?
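For the last question, one possible route from the raw data is sketched here, assuming that $X$ and $Y$ are independent (an assumption not stated above): draw random pairs from the two raw samples, add them, and histogram the sums. The function name `sum_distribution_from_samples` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def sum_distribution_from_samples(x, y, bins=50, n_pairs=100_000, seed=0):
    """Estimate the density of X + Y from raw samples of X and Y.

    Assumes X and Y are independent, so any x value may be paired
    with any y value.
    """
    rng = np.random.default_rng(seed)
    xs = rng.choice(np.asarray(x), size=n_pairs, replace=True)
    ys = rng.choice(np.asarray(y), size=n_pairs, replace=True)
    # density=True normalises the histogram so that it integrates to 1.
    return np.histogram(xs + ys, bins=bins, density=True)

# Synthetic stand-ins for the raw data.
rng = np.random.default_rng(1)
x_raw = rng.normal(0.0, 1.0, size=5000)
y_raw = rng.normal(3.0, 2.0, size=5000)
density, edges = sum_distribution_from_samples(x_raw, y_raw)
```

This avoids any assumption about how values are distributed inside a bin, but the result is a random estimate whose accuracy depends on `n_pairs` and on the size of the raw samples.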

I am currently working with Python and NumPy/SciPy, so a good solution using those would be great. But anything else is also fine; I'm really looking for information here, including theory, further links, etc.
