Rigorously, why do the Laplace and Fourier transforms “reveal” the sinusoidal or exponential decomposition at their peaks?


I have proven to myself that if you assume some continuous function $f(x)$ to be an infinite sum of sinusoids, then the integral techniques of a Fourier series yield exactly the correct amplitude for a given frequency. With the Fourier transform, as the period is extended to the whole real line, the integration no longer, it seems to me, gives the exact amplitude for a given frequency. However, very intuitively, all the resources I have seen state that the peaks of the Fourier transform show which frequencies are present in the original function. But the graph of the Fourier transform isn’t a Dirac-Delta-esque impulse spike; it is a smooth function (in the graphs I’ve seen), which suggests that the frequencies that differ ever so slightly from the “key” frequencies of the function are still present in some sense, are still shown to have some amplitude. At this point it all gets extremely hand-wavy, which I dislike in maths. YouTube videos and Wikipedia visual demonstrations are one thing, but I appreciate rigour.

My (self intuited - any better phrasing or writing of anything I am saying would be appreciated) thoughts (on Fourier - I’m getting to Laplace):

I am working from a background of someone who can prove the validity of Fourier series, but has seen no rigour on the subject of Fourier transforms. I assume this is the rationale: take $f(x)=\sum_{n\in S}c_n\exp(2\pi in\cdot x)$, where $S$ is the set of frequencies that constitute $f$. Then one convention of the Fourier transform says, assuming $\xi$ is a frequency present in $f$: $$\begin{align}\hat f(\xi)&=\int_{\Bbb{R}}f(x)\exp(-2\pi ix\cdot\xi)\,dx\\&=\int_{\Bbb{R}}c_\xi\exp((2\pi i\xi-2\pi i\xi)x)\,dx+\sum_{(n\neq\xi)\in S}\int_{\Bbb{R}}c_n\exp(2\pi ix(n-\xi))\,dx\\&=\int_{\Bbb{R}}c_\xi\,dx+\sum_{(n\neq\xi)\in S}\int_{\Bbb{R}}c_n\exp(2\pi ix(n-\xi))\,dx\end{align}$$

And I’ve gathered that the latter sum of the integrals $(n\neq\xi)$ is supposed to fizzle out somehow, as the complex exponential has a non-zero argument and as such will ... rotate around and around and integrate to zero? This was hand-waved in a resource I viewed, and it struck me as very informal. A proof of this, or a better replacement statement, is one of my questions. As for the first integral, well this one won’t rotate around and fizzle out, since it has no exponential, but I fail to see how the integral doesn’t diverge, and merely just peaks instead when $\xi$ is a frequency of $f$; the amplitude associated with $\xi$, which I’ve denoted $c_\xi$, is being integrated over the whole real line and should diverge (and not finitely peak) if it is non-zero... I am very confident my error is in treating $f$ as a weighted sum, as we do in Fourier series, but I don’t see an alternative way! None of this explains either the reality that the graph of the transform still has height in the neighbourhood of some component frequency, suggesting the frequencies near to it are present... how does the transform capture this? And why is it the case? Indeed, is the transform equal to zero when evaluated at a non-component frequency, or is it just very close to zero? I’ve never been able to tell by eyeballing graphs.

I have similar issues with the Laplace transform. My entire argument as above is the same, leading to the same issues, just considering $f$ as a sum of exponentials, be they complex or real. However there is one key thing to add: a video on the Laplace transform explained that the poles of the Laplace transform show which complex $s$ are present in $f$, considered as a (I assume weighted) sum of the $\exp(st)$... again, my Fourier-series intuition of a discrete sum is creating issues that I am unsure how to resolve. But why poles! The Fourier transform (that I’ve seen) tends to just peak when it is evaluated at a component frequency - what is so different about Laplace that it has poles at the component frequencies? And what does it say about the frequencies very near to the component ones, as they will have very large height on the graph too?

Everything I’ve seen on either transform just talks of them as natural generalisations; in particular, I gather a student is expected to blindly swallow: “the Fourier transform is just like the Fourier series with infinite period” but that creates issues! There must be something different going on; plus, there are many different conventions for the transform, all of which would yield different amplitudes for a given frequency, which is incongruous with the idea of a Fourier transform being “just like” the Fourier series.

Many thanks to anyone who can provide the rigour that is lacking in all these surface level explanations of my own half-baked thoughts and reading. It would be so nice if Wikipedia provided this kind of information, instead of just statements of fact!

Lastly (but I don’t care so much about this particular question), this is also doing my head in:

The Poisson summation formula says that the sum of Fourier series coefficients is the same as the sum of the evaluations of the transform at the component frequencies - but if my hunch about there being a key difference between the idea of Fourier series and Fourier transform is correct, this makes no sense to me!

There are 4 answers below.

Answer 1 (score 10)

This may not fully answer your question, but the following is everything I know about the Fourier and Laplace transforms, at least coming from my field (mechanical engineering).

Background

Consider the following linear Ordinary Differential Equation (ODE):

$$ \sum_{k}{b_{k}\frac{d^{k}}{dx^{k}}y\left(x\right)}=f\left(x\right) $$

In my field I am often interested in finding the particular solution of this ODE. If $f(x)$ is a linear combination of exponential functions:

$$ \begin{align} \sum_{k}{b_{k}\frac{d^{k}}{dx^{k}}y\left(x\right)}&=\sum_{m}{C_{m}e^{s_{m}x}}\\ \\ y_{p}\left(x\right)&=\sum_{m}{H\left(s_{m}\right)C_{m}e^{s_{m}x}}\\ \\ H(s_{m})&=\frac{1}{\sum_{k}{b_{k}s_{m}^{k}}} \end{align} $$

Notice how easy and simple it is to find the particular solution: we just scale each exponential by the function $H\left(s\right)$, called the transfer function. It is therefore of interest to represent an arbitrary function $f(x)$ as a combination of (possibly infinitely many) exponential functions.
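As a quick sanity check (my own illustration, not part of the original argument), here is a numerical verification for the arbitrarily chosen ODE $y''+2y'+5y=3e^{-x}$, using NumPy:

```python
import numpy as np

# Hypothetical ODE: y'' + 2 y' + 5 y = f(x) with forcing f(x) = 3 exp(-x).
# Operator coefficients b_k, lowest order first.
b = [5.0, 2.0, 1.0]
s, C = -1.0, 3.0                       # forcing C * exp(s x)

# Transfer function H(s) = 1 / sum_k b_k s^k  ->  1/(5 - 2 + 1) = 1/4
H = 1.0 / sum(bk * s**k for k, bk in enumerate(b))

def yp(x):
    return H * C * np.exp(s * x)       # claimed particular solution

# Central finite differences confirm yp'' + 2 yp' + 5 yp = f at a test point
x0, h = 0.7, 1e-4
d1 = (yp(x0 + h) - yp(x0 - h)) / (2 * h)
d2 = (yp(x0 + h) - 2 * yp(x0) + yp(x0 - h)) / h**2
residual = d2 + 2 * d1 + 5 * yp(x0) - C * np.exp(s * x0)
print(H, abs(residual))                # H = 0.25, residual ~ 0
```

The same scaling-by-$H(s)$ works for any exponential forcing, which is the whole point of decomposing $f$ into exponentials.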

Fourier Transform

Fourier claimed that many functions can be represented by infinitely many purely imaginary exponentials:

$$ f\left(x\right)=\frac{1}{2\pi}\int_{-\infty}^{\infty}F\left(\Omega\right)e^{i\Omega x}\phantom{x}d\Omega $$

In practice we want to find the function $F\left(\Omega\right)$ for a given $f(x)$. We do this as follows:

$$ \begin{align} \int_{-\infty}^{\infty}f(x)e^{-i\Omega x}\phantom{x}dx&=\frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}F\left(\omega\right)e^{i\omega x}\phantom{.}d\omega\phantom{.} e^{-i\Omega x}\phantom{.}dx\\ \\ &=\frac{1}{2\pi}\int_{-\infty}^{\infty}F\left(\omega\right)\int_{-\infty}^{\infty}e^{i\left(\omega - \Omega\right) x}\phantom{.}dx \phantom{.}d\omega\\ \\ &=\frac{1}{2\pi}\int_{-\infty}^{\infty}F\left(\omega\right)\phantom{.}2\pi\phantom{.}\delta\left(\omega - \Omega\right)\phantom{.}d\omega\\ \\ &=F\left(\Omega\right) \end{align} $$

As you may know, these two equations are the inverse Fourier transform and the Fourier transform, respectively. Notice that we never required $f(x)$ to be periodic, so in a sense there is no period and no discrete set of frequencies. So when you see an example of a Fourier transform that is a smooth continuous curve and wonder why there is no single "peak", maybe the function is simply not periodic.
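For instance (an illustrative example of my own), the non-periodic Gaussian $f(x)=e^{-x^2/2}$ has the smooth transform $F(\Omega)=\sqrt{2\pi}\,e^{-\Omega^2/2}$ under this convention, with no spikes anywhere; direct numerical integration confirms it:

```python
import numpy as np

# Non-periodic f(x) = exp(-x^2/2); its transform under
# F(Omega) = int f(x) exp(-i Omega x) dx is the smooth Gaussian
# sqrt(2 pi) exp(-Omega^2/2): no delta spikes at all.
x = np.linspace(-20, 20, 40001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)

def ft(omega):
    # plain Riemann sum; the integrand decays fast enough for this to work
    return np.sum(f * np.exp(-1j * omega * x)) * dx

for om in (0.0, 1.0, 2.5):
    exact = np.sqrt(2 * np.pi) * np.exp(-om**2 / 2)
    print(om, abs(ft(om) - exact))  # tiny numerical error
```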

I think it is the schools' fault for teaching Fourier series before the Fourier transform.

Fourier Series

In the special case in which $f(x)$ is a linear combination of purely imaginary exponential functions with period $T$, the Fourier transform becomes a linear combination of delta functions:

$$ \begin{align} f(x)&=\sum_{m\in\mathbb{Z}}{C_{m}e^{i\frac{2\pi m}{T}x}}\\ \\ F\left(\Omega\right)&=2\pi\sum_{m\in\mathbb{Z}}{C_{m}\delta\left(\Omega-\frac{2\pi m}{T}\right)} \end{align} $$

We are usually not interested in this full result; we typically only want the coefficients $C_{m}$. We obtain them using the Fourier series expansion:

$$ C_{m}=\frac{1}{T}\int_{0}^{T}f(x)e^{-i\frac{2\pi m}{T}x}\phantom{.}dx $$

This uses the fact that the integral of an imaginary exponential over its period is zero, as you may already know.
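Here is a small numerical illustration (mine, with made-up coefficients): build a $T$-periodic signal from three known $C_m$ and recover them with the integral above, while frequencies absent from the signal integrate to zero:

```python
import numpy as np

# Build f from known Fourier coefficients, then recover them with
# C_m = (1/T) int_0^T f(x) exp(-i 2 pi m x / T) dx.
T = 2.0
coeffs = {1: 0.5, -2: 1.5 + 0.5j, 3: -0.25}        # chosen C_m
x = np.linspace(0, T, 200000, endpoint=False)
dx = x[1] - x[0]
f = sum(c * np.exp(1j * 2 * np.pi * m * x / T) for m, c in coeffs.items())

def C(m):
    # orthogonality kills every term except the m-th
    return np.sum(f * np.exp(-1j * 2 * np.pi * m * x / T)) * dx / T

print(C(-2))   # recovers 1.5 + 0.5j
print(C(5))    # ~ 0: frequency 5 is not present in f
```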

Conclusion

The difference between the Fourier transform and the Fourier series lies in the fact that the Fourier series is for periodic functions, while the Fourier transform can also be applied to non-periodic functions, which leads to smooth transforms rather than singular peaks.

Later on, the Fourier series will be the basis for the Discrete-Time Fourier Transform (DTFT) and the Fast Fourier Transform (FFT).

Bonus: Laplace Transform

Recall that many (but not all) functions can be represented by a linear combination of purely imaginary exponential functions; for some functions the integral diverges. However, notice that even when the Fourier transform of $f(x)$ does not exist, the transform of $e^{-\sigma x}f(x)$ may exist. We call the Fourier transform of $f(x)$ after multiplication by $e^{-\sigma x}$ the Laplace transform.

$$ \begin{align} F(\sigma, i\Omega)&=\int_{-\infty}^{\infty}e^{-\sigma x}f(x)e^{-i\Omega x}\phantom{.}dx\\ \\ &=\int_{-\infty}^{\infty}f(x)e^{-\left(\sigma+i\Omega\right) x}\phantom{.}dx\\ \\ \\ e^{-\sigma x}f(x)&=\frac{1}{2\pi}\int_{-\infty}^{\infty}F(\sigma, i\Omega)e^{i\Omega x}\phantom{.}d\Omega\\ \\ f(x)&=\frac{e^{\sigma x}}{2\pi}\int_{-\infty}^{\infty}F(\sigma, i\Omega)e^{i\Omega x}\phantom{.}d\Omega\ \end{align} $$

The Laplace transform only exists for certain values of $\sigma$; this set is called the region of convergence.
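A sketch of this (my own example): $f(x)=e^{2x}$ for $x\ge 0$ has no Fourier transform, but once $\sigma>2$ the damped integral converges to the closed form $1/(s-2)$:

```python
import numpy as np

# f(x) = exp(2x) on x >= 0: no Fourier transform, but the damped
# integral int_0^inf f(x) exp(-(sigma + i Omega) x) dx converges for
# sigma > 2 (the region of convergence) to 1 / (s - 2).
a = 2.0
x = np.linspace(0, 60, 600000, endpoint=False)
dx = x[1] - x[0]
f = np.exp(a * x)

def laplace(sigma, omega):
    s = sigma + 1j * omega
    return np.sum(f * np.exp(-s * x)) * dx

s = 3.0 + 4.0j                                 # sigma = 3 > 2: inside the ROC
print(abs(laplace(3.0, 4.0) - 1 / (s - a)))    # small error vs 1/(s-2)
print(laplace(1.0, 0.0).real)                  # sigma = 1 < 2: blows up
```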

Answer 2 (score 7)

The Fourier series represents a decomposition of $f \in L^2[0,2\pi]$ into an orthonormal basis of eigenfunctions of the linear operator $Lf=\frac{1}{i}\frac{df}{dx}$, i.e. $L=-i\frac{d}{dx}$. This operator is self-adjoint in the strictest mathematical sense when defined on the domain $\mathcal{D}(L)$ consisting of all absolutely continuous functions $f : [0,2\pi]\rightarrow\mathbb{C}$ with derivative $f'\in L^2[0,2\pi]$ and $f(0)=f(2\pi)$. It has an orthonormal basis of eigenfunctions $\{ e_n(x) \}_{n=-\infty}^{\infty}$, where $e_n(x)=\frac{1}{\sqrt{2\pi}}e^{inx}$; in particular $e_n \in \mathcal{D}(L)$ and $Le_n = n e_n$ for all $n\in\mathbb{Z}$. Because $L$ is self-adjoint, these eigenfunctions form a complete orthonormal basis of $L^2[0,2\pi]$, meaning that every $f\in L^2[0,2\pi]$ can be written as $$ f = \sum_{n=-\infty}^{\infty}\langle f,e_n\rangle e_n. $$ The convergence takes place in the norm of $L^2[0,2\pi]$, which does not guarantee pointwise convergence. However, a deep theorem of Lennart Carleson says that the series also converges pointwise almost everywhere. This is the best that could be expected, since $f,g\in L^2[0,2\pi]$ have the same Fourier expansion whenever $f=g$ outside a set of Lebesgue measure $0$. (The proof of Carleson's theorem is basically unreadable except to very well-trained mathematicians; the $L^2$ norm convergence is much easier to prove, and it is not so difficult to show that a subsequence of the partial sums converges pointwise a.e.) Explicitly,
\begin{align} f &= \sum_{n=-\infty}^{\infty}\left(\int_{0}^{2\pi}f(x')\frac{e^{-inx'}}{\sqrt{2\pi}}dx'\right) \frac{e^{inx}}{\sqrt{2\pi}} \\ &= \sum_{n=-\infty}^{\infty}\left(\frac{1}{2\pi}\int_{0}^{2\pi}f(x')e^{-inx'}dx'\right) e^{inx}. \end{align}
Furthermore, the Parseval/Plancherel identity holds in $L^2[0,2\pi]$: $$ \int_{0}^{2\pi}|f(x')|^2dx'=\|f\|^2 = \sum_{n=-\infty}^{\infty}\left|\langle f,\tfrac{e^{inx}}{\sqrt{2\pi}}\rangle\right|^2 = \sum_{n=-\infty}^{\infty}\left|\int_{0}^{2\pi}f(x')\frac{e^{-inx'}}{\sqrt{2\pi}}dx'\right|^2 $$

The $L^2(\mathbb{R})$ case of $Lf = \frac{1}{i}\frac{df}{dx}$ is trickier than the case on the finite interval $[0,2\pi]$, because on $\mathbb{R}$ the operator $L=\frac{1}{i}\frac{d}{dx}$ has no eigenfunctions or eigenvalues at all! The functions that would be eigenfunctions, namely $e^{i\lambda x}$ for $\lambda\in\mathbb{R}$, are not in $L^2(\mathbb{R})$. We may still talk about the functions $e^{i\lambda x}$: they are classical eigenfunctions, but they do not live in the space $L^2(\mathbb{R})$ on which $L$ is defined. However, we do have approximate eigenvectors in the $L^2(\mathbb{R})$ sense. For example, $$ e_{\lambda,\delta}(x)=\frac{1}{\sqrt{2\pi}}\int_{\lambda-\delta}^{\lambda+\delta}e^{i\mu x}d\mu $$ is in $L^2(\mathbb{R})$, and it is an approximate eigenvector of $L$ with approximate eigenvalue $\lambda$. To be specific, the Plancherel theorem gives $$ \|e_{\lambda,\delta}\|^2= \int_{\lambda-\delta}^{\lambda+\delta}1\,d\mu =2\delta, $$ and $$ \|Le_{\lambda,\delta}-\lambda e_{\lambda,\delta}\|^2 = \left\|\frac{1}{\sqrt{2\pi}}\int_{\lambda-\delta}^{\lambda+\delta}(\mu-\lambda)e^{i\mu x}d\mu\right\|^2 =\int_{\lambda-\delta}^{\lambda+\delta}(\mu-\lambda)^2d\mu = \left. \frac{(\mu-\lambda)^3}{3}\right|_{\mu=\lambda-\delta}^{\lambda+\delta}=\frac{2\delta^3}{3}. $$ So $e_{\lambda,\delta}$ is an approximate eigenvector family with approximate eigenvalue $\lambda$ as $\delta\rightarrow 0$, because $$ \|(L-\lambda I)e_{\lambda,\delta}\|^2 \le \frac{\delta^2}{3}\|e_{\lambda,\delta}\|^2. $$ Basically, all self-adjoint operators work the same way as these two examples: a self-adjoint operator may have discrete and/or continuous spectrum, and these two differentiation operators are typical of what happens.
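These two norms can be checked numerically (my own sketch; the closed form $e_{\lambda,\delta}(x)=\sqrt{2/\pi}\,\frac{\sin(\delta x)}{x}e^{i\lambda x}$ follows by evaluating the $\mu$ integral):

```python
import numpy as np

# Check the approximate-eigenvector norms for L = -i d/dx on L^2(R),
# with lambda = 2, delta = 0.5.  Closed form of the wave packet:
# e_{lam,del}(x) = sqrt(2/pi) * sin(delta x)/x * exp(i lam x).
lam, delta = 2.0, 0.5
dx = 0.005
x = np.arange(-500.0, 500.0, dx)
# np.sinc is the normalized sinc, so sin(d x)/x = d * sinc(d x / pi)
e = np.sqrt(2 / np.pi) * delta * np.sinc(delta * x / np.pi) * np.exp(1j * lam * x)

norm2 = np.sum(np.abs(e) ** 2) * dx              # should be ~ 2*delta = 1
Le = -1j * np.gradient(e, dx)                    # L e via centered differences
err2 = np.sum(np.abs(Le - lam * e) ** 2) * dx    # should be ~ 2*delta^3/3
print(norm2, err2, err2 / norm2)                 # ratio ~ delta^2/3
```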

Answer 3 (score 6)

You already have some nice answers. I just wanted to comment on what you wrote here:

But the graph of Fourier transform isn’t a Dirac-Delta-esque impulse spike; it is a smooth function (in the graphs I’ve seen), which suggests that the frequencies that differ ever so slightly from the “key” frequencies of the function are still present in some sense, are still shown to have some amplitude

and here:

None of this explains either the reality that the graph of the transform still has a height in the neighborhood of some component frequency, suggesting the frequencies near to it are present... how does the transform capture this? And why is it the case? Indeed, is the transform equal to zero when evaluated at a non-component frequency, or is it just very close to zero? I’ve never been able to tell by eyeballing graphs.

I do not know which particular "graphs" you have looked at, but I have some idea of what the problem might be. I know that you want a rigorous answer, and I will do my best; however, I suspect that the problem with these graphs is a "practical" artifact, as I will explain in a moment.

As others have answered, the Fourier transform of a periodic function $$ f(x)=\sum_{m\in\mathbb{Z}}{C_{m}e^{i\frac{2\pi m}{T}x}} $$ is $$ F\left(\omega\right)=2\pi\sum_{m\in\mathbb{Z}}{C_{m}\delta\left(\omega-\frac{2\pi m}{T}\right)} $$ which is a Dirac-delta impulse spike "train", not a smooth function.

However, in practice, even for periodic functions and depending on the reference you use, the "graph" of the Fourier transform is often drawn with a smooth shape. For example, a single tone $f(x) = e^{i\omega_0 x}$ should have a single delta spike centered at $\omega_0$ as its Fourier transform, but you might see its "Fourier transform" represented as:

[Figure: a smooth sinc-shaped pulse centered at $\omega_0$, in place of a delta spike]

and this happens more often if you are reading a book (or watching a video) oriented more to the practical side (e.g. signal processing).

What is happening is that for practical "signals" (functions), such as audio or video, the Fourier transform is really computed using the Fast Fourier Transform (FFT) over a "window" (interval) only, instead of the whole of $\mathbb{R}$. This is because the FFT is a numerical algorithm that cannot be applied to an infinite-length input without the computation never finishing.

Hence, instead of computing the Fourier transform of the original $f(x)$, you are essentially looking at the Fourier transform of $$ \tilde{f}(x) = f(x)\Pi(x) $$ where $\Pi(x) = 1$ if $x\in[-L/2,L/2]$ and $\Pi(x)=0$ otherwise, with $[-L/2,L/2]$ the interval of interest. Moreover, as I will discuss later, the approximation of $f(x)$ by $\tilde{f}(x)$ gets better as the window length $L\to\infty$. Now, if you compute the Fourier transform of such a product, you obtain a convolution: $$ \mathcal{F}\{\tilde{f}(x)\} = \mathcal{F}\{f(x)\Pi(x)\} = \frac{1}{2\pi}\mathcal{F}\{f(x)\}\ast \mathcal{F}\{\Pi(x)\} $$ where $$ \mathcal{F}\{\Pi(x)\} = \int_{-\infty}^{\infty} \Pi(x) e^{-i\omega x}dx = \int_{-L/2}^{L/2} e^{-i\omega x}dx = \frac{1}{i\omega}\left[e^{i\omega L/2}-e^{-i\omega L/2}\right] = \frac{2\sin(\omega L/2)}{\omega} = L\,\text{sinc}\left(L\frac{\omega}{2}\right) $$ using the (non-normalized) sinc function $\text{sinc}(x) = \sin(x)/x$, which looks like a pulse, as in the graph above. Now, if $f(x)$ is a single tone at frequency $\omega_0$, then $\mathcal{F}\{f(x)\} = 2\pi\delta(\omega-\omega_0)$. Hence, $$ \frac{1}{2\pi}\mathcal{F}\{f(x)\}\ast \mathcal{F}\{\Pi(x)\} = \delta(\omega-\omega_0)\ast L\,\text{sinc}\left(L\frac{\omega}{2}\right) = L\,\text{sinc}\left(L\frac{\omega-\omega_0}{2}\right) $$

Thus, instead of $2\pi\delta(\omega-\omega_0)$ you obtain $L\,\text{sinc}\left(L\frac{\omega-\omega_0}{2}\right)$. Interestingly, it can be shown that $$ \lim_{L\to \infty} L\,\text{sinc}\left(L\frac{\omega-\omega_0}{2}\right) = 2\pi\delta(\omega-\omega_0) $$ in the distributional sense, as expected.
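A numerical sketch of this limit (my own): integrating the windowed tone for growing $L$, the peak height grows like $L$ while anything a fixed distance from $\omega_0$ shrinks relative to it:

```python
import numpy as np

# Windowed transform of the tone exp(i w0 x) over [-L/2, L/2]:
# exactly L at w = w0, and L*sinc(L (w - w0)/2) elsewhere.  As L grows,
# the relative height away from w0 collapses -- the sinc tends to a delta.
w0 = 5.0

def windowed_ft(w, L, n=200000):
    x = np.linspace(-L / 2, L / 2, n, endpoint=False)
    return np.sum(np.exp(1j * (w0 - w) * x)) * (L / n)

for L in (10.0, 100.0):
    peak = abs(windowed_ft(w0, L))        # = L at the tone frequency
    near = abs(windowed_ft(w0 + 0.5, L))  # 0.5 rad/s off the tone
    print(L, peak, near / peak)           # ratio shrinks as L grows
```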

In summary:

In practice, many authors depict the "Fourier transform" of signals using the FFT of a "windowed" version of the signal. The result is that delta spikes are replaced by sinc pulses. This may be why you see a non-zero Fourier transform contribution in the neighbourhood of where you expected the true delta spike.


Now, regarding the Laplace transform:

It is well known that the Laplace transform $\mathcal{L}\{f(x)\}(s)$ evaluated at $s=i\omega$ leads to the Fourier transform. Well... this is not entirely correct. It is only true for the double-sided Laplace transform: $$ \mathcal{L}_{ds}\{f(x)\}(s) = \int_{-\infty}^{\infty} f(x)e^{-sx}dx $$ However, most references use the single-sided Laplace transform: $$ \mathcal{L}_{ss}\{f(x)\}(s) = \int_{0}^{\infty} f(x)e^{-sx}dx $$ What is the relation between the two? The single-sided transform is the double-sided transform applied to the signal $$ \tilde{f}(x) = f(x)u(x) $$ instead, where $u(x)=1$ for $x\geq 0$ and $u(x)=0$ otherwise. Similarly to the discussion above for the Fourier transform, the Laplace transform of $\tilde{f}(x)$ is given (up to constants) as a convolution: $$ \mathcal{L}_{ss}\{\tilde{f}(x)\} = \mathcal{L}_{ds}\{f(x)\}\ast \mathcal{L}_{ds}\{u(x)\} $$ Now comes the interesting part: $\mathcal{L}_{ds}\{u(x)\} = \frac{1}{s}$, a pole!

Hence, similarly as before, when evaluating at $s=i\omega$, instead of $\delta(\omega-\omega_0)$ spikes you will obtain poles $\frac{1}{i(\omega-\omega_0)}$ from the convolution above (perhaps I am missing a multiplicative factor somewhere).
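For a single tone this is easy to check numerically (my own sketch): the one-sided transform of $e^{i\omega_0 t}$ is $\frac{1}{s-i\omega_0}$, so approaching the pole along $s=\sigma+i\omega_0$ the magnitude grows like $1/\sigma$:

```python
import numpy as np

# One-sided Laplace transform of the tone f(t) = exp(i w0 t) is
# 1 / (s - i w0): a pole at s = i w0, not a delta spike.
w0 = 3.0
t = np.linspace(0, 1500, 1500000, endpoint=False)
dt = t[1] - t[0]
f = np.exp(1j * w0 * t)

def laplace_ss(s):
    return np.sum(f * np.exp(-s * t)) * dt

for sigma in (1.0, 0.1, 0.01):
    val = laplace_ss(sigma + 1j * w0)
    print(sigma, abs(val))   # grows like 1/sigma as s approaches the pole
```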

In summary:

Using the unilateral Laplace transform is using the bilateral transform, but through a similar windowing process as for the Fourier transform. However, in this case the window is the step $u(x)$, which leads to a pole $1/s$ instead of a sinc function. This is why you do not see sharp spikes in this case either.

As a last note: the double-sided Laplace transform of a single tone won't converge unless $s=i\omega$, where it is the (distributional) Fourier transform. So everything should be consistent. When using the Laplace transform, don't forget to take the region of convergence into account.

Answer 4 (score 4)

This is an interesting question.

Let's define a signal with unit amplitude, $s(t)=\exp(j2\pi f_1 t)$, where $f_1$ is the frequency of the signal and $t\in [0,T_{total})$ is the time variable. If you want to compute the continuous Fourier transform of this signal, you can plug it into the integral, and you will obtain what @FeedbackLooper explained.

Another interesting thing (in my opinion) happens when you compute the Discrete Fourier Transform (DFT). To do that, you sample the time variable as $t=\frac{n}{f_s}$, $n=0,\dots,N-1$, where $f_s$ is the sampling frequency. Notice that we take only $N$ samples of the signal, each $T = \frac{1}{f_s}$ seconds apart. Moreover, we put a time window on our observation of the signal, i.e. we observe it only for a duration of $NT$. Let's keep this in mind and sample the signal:

$$ s[n] = \exp\left(j2\pi \frac{f_1}{f_s} n\right) \operatorname{rect}[n] = \exp(j2\pi \Delta_1 n)\operatorname{rect}[n] $$ where $\operatorname{rect}[n]$ is the rectangle function, i.e. the time window over which we observed the signal.

So here is the first important thing: the frequency of the signal is divided by the sampling frequency of our system. This will be important later on! Now, let's compute the DFT of this signal

$$ S[k] = \sum_{n=0}^{N-1} s[n]\, \exp\left(-j2\pi \frac{kn}{N}\right), \quad k=0,\dots,N-1 $$ yielding (I skip some algebra)

$$ S[k] \approx \operatorname{sinc}(k - N\Delta_1) $$ (strictly a Dirichlet kernel, i.e. a periodic sinc, centered at $k = N\Delta_1$, times a phase factor).

Now, this is probably the most important part. Notice that $k$ is our frequency grid of integers, i.e. the points at which our system can take values. The separation between those grid points is $\delta_k=\frac{1}{NT}$ (in Hz), also known as the frequency resolution. Now, there are two scenarios.

  1. $N\Delta_1$ is an integer, i.e. $f_1$ is an integer multiple of the frequency resolution $\frac{f_s}{N}$. In this case, the sinc function is shifted such that its maximum is centered exactly on the bin $k=N\frac{f_1}{f_s}$, and the nulls of the sinc are located on the remaining $k$ values. Your system then 'observes' a Dirac delta, while the real underlying spectrum has a sinc shape.

  2. $N\Delta_1$ is not an integer, i.e. $f_1$ is not an integer multiple of $\frac{f_s}{N}$. In this case, the sinc function is shifted such that its maximum lies somewhere between two adjacent bins, so neither the maximum nor the nulls of the sinc are located on any $k$. Thus, the signal's response leaks into all the frequency bins.
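The two scenarios can be reproduced in a few lines (my own sketch, using the same numbers as the figure below: $\frac{f_s}{N} = 8$ Hz resolution, tones at 32 Hz and 30 Hz):

```python
import numpy as np

# Spectral leakage demo matching the figure: sample rate 256 Hz, N = 32,
# so the frequency grid spacing (resolution) is 256/32 = 8 Hz.
fs, N = 256.0, 32
n = np.arange(N)

def dft_mag(f1):
    # N-point DFT magnitude of the tone exp(j 2 pi (f1/fs) n)
    return np.abs(np.fft.fft(np.exp(2j * np.pi * f1 / fs * n)))

on_grid = dft_mag(32.0)    # 32 Hz = 4 * 8 Hz: sits exactly on bin 4
off_grid = dft_mag(30.0)   # 30 Hz falls between bins 3 and 4

print(np.count_nonzero(on_grid > 1e-9))    # 1: a single clean spike
print(np.count_nonzero(off_grid > 1e-9))   # 32: leakage into every bin
```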

By looking at the image below, you can see what I explained. In this particular case, the frequency resolution is 8 Hz. The first image shows the spectrum of a signal whose frequency is 32 Hz, and the second shows the spectrum of a signal whose frequency is 30 Hz. The vertical black lines correspond to the spectral grid I mentioned earlier. Notice that in the first image the sinc function is sampled at its maximum and its nulls (in red), which then looks like a Dirac delta (in black). In the second image, however, the sinc function is sampled at points other than its maximum and nulls.

I hope this puts things in perspective.

Cheers.

[Figure: DFT magnitudes of 32 Hz and 30 Hz tones on an 8 Hz frequency grid; the underlying sinc (red) is sampled at its maximum and nulls in the first case, and away from them in the second (black)]