Functional Analysis has never been particularly intuitive to me, but in recent years I've started to see how it aids Fourier Analysis, Operator Theory and PDEs. Still, for the life of me I don't get the fixation on the convolution operation. I know it has some applications in engineering, but it dates back to the time of Laplace. How is it used in mathematics?
One example I can think of: if you have a family of good kernels/mollifiers $D_n$ and $f\in L^p$ with $1\le p<\infty$, then the functions $D_n * f$ are smooth and converge to $f$ in $L^p$.
First of all, there's no "fixation" whatsoever. No more than there is a "fixation" with addition. Or with numbers. Or with vector spaces. Actually way way less, as convolution does not feature prominently.
In any case, convolution appears very naturally in lots of contexts. Off the top of my head, and with the caveat that I don't know/use/see convolution much:
in Probability, the distribution of the sum of two independent random variables is the convolution of their distributions;
when you multiply two polynomials, the coefficients of the product are obtained as the convolution of the two coefficient sequences;
in Group Theory, $\ell^1(G)$ becomes an algebra in a natural way, with convolution as the product. "Natural" because the canonical basis satisfies $e_{gh}=e_g*e_h$;
as mentioned by copper.hat, the Fourier transform takes convolution to pointwise product;
as mentioned in the question, convolution allows one to show that $C^\infty$ functions with compact support are dense in $L^p(\mathbb R^n)$ for $1\le p<\infty$.
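To make the probability and polynomial items concrete, here is a minimal Python sketch. The helper `convolve` and the specific inputs (two small polynomials, two fair six-sided dice) are my own illustrative choices, not anything from the question; the point is only that one and the same discrete convolution handles both cases.

```python
from fractions import Fraction

def convolve(a, b):
    """Discrete convolution of two finite sequences:
    c[k] = sum of a[i] * b[j] over all i + j = k."""
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return c

# Polynomial product, coefficients listed in increasing degree:
# (1 + 2x)(3 + x + x^2) = 3 + 7x + 3x^2 + 2x^3
poly = convolve([1, 2], [3, 1, 1])

# Sum of two independent fair dice: each die is uniform on the faces 1..6.
die = [Fraction(1, 6)] * 6
two_dice = convolve(die, die)  # entry k is the probability of total k + 2
```

Here `poly` holds the coefficients of the product, and `two_dice[5]` is the probability of rolling a total of 7, namely $1/6$.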