I encountered convolution in signal processing and in convolutional neural networks (CNNs). I had a hard time understanding convolution's mathematical properties and their connection to these applications.
My questions are:
Where is convolution used in mathematics?
Why is convolution used so widely in signal processing? (More specifically, why can it enhance a signal and reduce noise? Which property is being used here, and is there a written proof?)
What is the role of convolution in CNNs? How can a deep learning network be built from simple convolutions?
Some related pages are listed below, but I'm not quite satisfied by the answers: Understanding convolution , understanding the convolution in signals and systems
Convolution comes up all over mathematics, often in signal analysis or when solving initial value problems for PDEs (where you might convolve the data with a Green's function).
In signal analysis, convolution in the time domain corresponds to multiplication in the Fourier domain. Suppose you want to get rid of high-frequency content, since it is often mostly noise. You would take your time signal, compute its Fourier transform, and multiply it by a low-pass filter of some sort with an approximate cutoff around the frequency you want; a Gaussian is a common choice. To see what the resulting time-domain signal looks like, you then apply an inverse Fourier transform.
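A minimal NumPy sketch of that recipe, filtering a noisy sine wave (the signal, cutoff, and filter shape here are illustrative choices, not canonical):

```python
import numpy as np

# Toy signal: a slow 5 Hz sine wave plus broadband noise.
rng = np.random.default_rng(0)
n = 1024
t = np.linspace(0.0, 1.0, n, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * rng.standard_normal(n)

# Forward Fourier transform of the time signal.
spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(n, d=t[1] - t[0])

# Gaussian low-pass filter with an approximate cutoff around 20 Hz
# (an arbitrary choice for this example).
cutoff = 20.0
lowpass = np.exp(-0.5 * (freqs / cutoff) ** 2)

# Multiply in the Fourier domain, then invert to get the smoothed signal.
filtered = np.fft.ifft(spectrum * lowpass).real
```

The filtered signal should track the underlying sine wave much more closely than the noisy input does, since the Gaussian strongly attenuates everything well above the cutoff.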
If you want to skip the forward and inverse transforms, you can instead do a convolution directly in the time domain, because the Fourier transform has the following property:
$$\mathcal{F} (f*g) = \mathcal F (f) \mathcal{F}(g). $$
So multiplying by a function in the Fourier domain is the same as doing a convolution in the original domain. There is a caveat: you have to know the time-domain form (the inverse Fourier transform) of your filter. This is one reason to pick a Gaussian-based filter: the Fourier transform of a Gaussian is again a Gaussian, and with the right normalization a Gaussian is exactly its own transform.
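You can check the property $\mathcal{F}(f*g) = \mathcal{F}(f)\mathcal{F}(g)$ numerically for the discrete Fourier transform, where convolution means circular convolution (everything below is just an illustrative verification, not part of any standard API):

```python
import numpy as np

# Two arbitrary test sequences.
rng = np.random.default_rng(1)
f = rng.standard_normal(64)
g = rng.standard_normal(64)

# Circular convolution computed directly in the time domain:
# (f * g)[k] = sum_m f[m] g[(k - m) mod n].
conv = np.array([np.sum(f * np.roll(g[::-1], k + 1)) for k in range(64)])

# The same result via pointwise multiplication in the Fourier domain.
conv_via_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

print(np.allclose(conv, conv_via_fft))  # True
```

The two computations agree to floating-point precision, which is exactly the discrete convolution theorem at work.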
As for how this applies to neural networks, I can't say exactly, as it's a field in which I have only a passing interest and no formal training.