How to understand multi-resolution analysis and wavelet transform?


I just started learning multi-resolution analysis. I know that given a scaling function, through dilation and translation, a sequence of nested spaces can be generated: $\cdots \subset V_{-1} \subset V_0 \subset V_1 \subset \cdots$, whose union is dense in $L^2(\mathbb{R}^n)$. Also, $V_i$ can be decomposed as $V_{i-1} \oplus W_{i-1}$, where $W_{i-1}$ is generated by wavelet functions. My question is, since we can approximate functions in $L^2(\mathbb{R}^n)$ by scaling functions, why do we still need wavelet functions? And why do we need the decomposition $V_i = V_{i-1} \oplus W_{i-1}$?


Best answer:

Late to the party :)

This is an excellent question. Once you understand the answer, you'll see why wavelets are such a powerful tool.

It is true that scaling functions provide a way to approximate any function in $L^2(\mathbb R^n)$. The same can be said of wavelets themselves. So if both scaling functions and wavelets can be used to approximate a function, what makes wavelets more interesting?

The key is that wavelets have nice compression properties, whereas scaling functions don't. Compression means that if you decompose a smooth function over a wavelet basis, most of the coefficients will be small. This is sometimes referred to as a sparse representation. It has important applications in, well, signal compression (obviously), but also in denoising, machine learning, etc.

The gory details:

Consider a smooth function $f$ (say $f\in \mathcal C^{\infty}$). Now, decompose it using scaling functions: $$f(t)=\sum_{n\in\mathbb Z}a_n \phi\left(\frac {t-nT}T\right)$$ In that decomposition, there is no reason to expect that most of the $a_n$ coefficients will be small. Indeed, think of each translated/rescaled version of the scaling function as a bump (e.g. a Gaussian). The coefficient $a_n$ will therefore be a rough approximation of $f(nT)$, and that approximation becomes exact as $T\rightarrow 0$. So the smoothness of the original signal has no impact on the size of the coefficients; instead, the coefficients scale with the local magnitude of $f$. $$\boxed{\text{Decomposing }f\text{ using scaling functions does not lead to a sparse representation.}}$$
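To make this concrete, here is a small numerical sketch. The choices here are mine, not part of the argument above: I use the Haar scaling function (a box, so each coefficient is a local average) and a sine signal. The point is that the coefficients track the local values of $f$, so few of them are small:

```python
import numpy as np

# Sample a smooth function on a fine grid.
t = np.linspace(0, 1, 1024, endpoint=False)
f = np.sin(2 * np.pi * t)

# Haar scaling coefficients at scale T = 1/64: each coefficient is
# (up to normalization) the local average of f over one interval,
# i.e. roughly f(nT) -- the "bump" intuition from the text.
T = 1024 // 64  # 16 samples per interval
a = f.reshape(64, T).mean(axis=1)

# The coefficients follow the local magnitude of f, so only the few
# intervals near the zero crossings of the sine give small values.
small = np.sum(np.abs(a) < 0.1)
print(f"{small} of {a.size} scaling coefficients are below 0.1")
```

Only the handful of coefficients sitting on a zero crossing of the sine are small; all the others are of the order of the signal itself.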

Now consider using wavelets instead. Wavelets have zero (a.k.a. vanishing) moments, that is $$\int_{\mathbb R}\psi(x)x^k\,dx=0 \text{ for }k=0,\dots,K\tag{1}$$ where $K\geq 0$. The exact value of $K$ depends on the family of wavelets you're considering.
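A quick numerical check of $(1)$, using the Haar wavelet as an example (my choice; it has exactly one vanishing moment, i.e. $K=0$):

```python
import numpy as np

# The Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1). It has exactly one
# vanishing moment (K = 0 in the notation above): it kills constants
# but not degree-1 polynomials.
x = np.linspace(0, 1, 1000, endpoint=False)
psi = np.where(x < 0.5, 1.0, -1.0)
dx = x[1] - x[0]

m0 = np.sum(psi * x**0) * dx  # zeroth moment: 0
m1 = np.sum(psi * x**1) * dx  # first moment: nonzero (about -1/4)

print(m0, m1)
```

The zeroth moment vanishes while the first one doesn't, which is exactly why $K=0$ for Haar; smoother wavelet families (e.g. higher-order Daubechies) push $K$ up.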

That zero-moment property has a very important consequence. It can be shown (this is outside the scope of my answer; see textbooks, or Stéphane Mallat's and Yves Meyer's papers) that the wavelet coefficients of a smooth function decay rapidly (and this decay can be quantified). In fact, it's fairly easy to see from $(1)$ that the wavelet coefficients of any polynomial of degree $\leq K$ are equal to $0$. To go from polynomials to smooth functions, use Taylor approximations. So the smoother the function, the smaller most of its wavelet coefficients.
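Sticking with the Haar example ($K=0$), here is a sketch verifying that the wavelet coefficients of a degree-$0$ polynomial (a constant) vanish exactly, at every level of the transform:

```python
import numpy as np

# Multi-level Haar transform of a constant signal. Each detail
# coefficient is a (normalized) pairwise difference, and differences
# of equal values are exactly zero -- the K = 0 vanishing moment.
x = np.full(64, 3.7)               # degree-0 polynomial
a, details = x.copy(), []
for _ in range(6):
    a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
    details.append(d)

print(all(np.all(dd == 0) for dd in details))
```

All the information about a constant ends up in the single coarse scaling coefficient; every wavelet coefficient is identically zero, as $(1)$ predicts for polynomials of degree $\leq K$.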

$$\boxed{\text{Decomposing }f\text{ using wavelet functions leads to a sparse representation.}}$$

The conclusion is that smooth functions are not only well approximated with wavelets, their representations are sparse.
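As an illustration of this sparsity, here is a sketch (again with a hand-rolled Haar transform, chosen for simplicity) measuring how concentrated the wavelet coefficients of a smooth signal are:

```python
import numpy as np

def haar_dwt(f, levels):
    """Multi-level orthonormal Haar transform: pairwise sums give the
    scaling (approximation) coefficients, pairwise differences give the
    wavelet (detail) coefficients."""
    a, details = f.astype(float), []
    for _ in range(levels):
        a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
        details.append(d)
    return a, np.concatenate(details)

t = np.linspace(0, 1, 1024, endpoint=False)
f = np.sin(2 * np.pi * t)                 # a smooth signal

a, d = haar_dwt(f, 7)
d_sorted = np.sort(np.abs(d))[::-1]
top = int(0.05 * d.size)                  # largest 5% of wavelet coefficients
frac = np.sum(d_sorted[:top] ** 2) / np.sum(d ** 2)
print(f"Top 5% of wavelet coefficients hold {100 * frac:.1f}% of the detail energy")
```

The vast majority of the detail energy lives in a small fraction of the coefficients; keeping only those (and the coarse approximation) is the essence of wavelet compression.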

At this point you might say: Wait a minute! I already had that with Fourier series! I can take a smooth function in $L^2([0, 1])$ and represent it with a Fourier series. The Fourier coefficients decay rapidly when the function is smooth (Riemann-Lebesgue lemma), which gives a sparse representation. So why do we need wavelets?

Well, wavelets hold a major advantage over the Fourier basis: they are localized in both time and frequency, whereas the Fourier basis is only localized in frequency. One consequence is that the Fourier representation is very sensitive to local perturbations while the wavelet one isn't. Concretely, if you take a smooth signal and add just one point of discontinuity, then all the Fourier coefficients will be affected and you'll lose the sparse representation. With wavelets, only a small number of coefficients will be affected. This makes them a great fit for many tasks, such as image compression (where you have smooth areas separated by a few edges).
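To illustrate this sensitivity difference, here is a sketch (using the FFT and a hand-rolled Haar transform, both choices of mine) that adds a single jump to a smooth signal and counts how many coefficients of each transform change noticeably:

```python
import numpy as np

n = 1024
t = np.linspace(0, 1, n, endpoint=False)
f = np.sin(2 * np.pi * t)
g = f + (t >= 0.3)          # same signal plus one step discontinuity

def haar_coeffs(x, levels=7):
    # Full multi-level Haar transform: coarse approximation followed
    # by all detail coefficients, concatenated into one vector.
    a, details = x.astype(float), []
    for _ in range(levels):
        a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
        details.append(d)
    return np.concatenate([a] + details)

# Count coefficients whose value changed noticeably after the jump.
fourier_changed = np.sum(np.abs(np.fft.fft(g) - np.fft.fft(f)) / n > 1e-3)
wavelet_changed = np.sum(np.abs(haar_coeffs(g) - haar_coeffs(f)) > 1e-3)
print(f"Fourier: {fourier_changed} coefficients changed; "
      f"Haar: {wavelet_changed} coefficients changed")
```

The jump spreads across hundreds of Fourier coefficients (they decay only like $1/k$), while in the wavelet transform only the handful of coefficients whose support straddles the jump, plus the coarse averages, are affected.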

Finally, I'll let you ponder over the following (ignore if that doesn't make too much sense): Using Fourier to represent a signal is like using scaling functions to represent the Fourier transform of that signal (that is, the signal in the frequency domain). Neither is particularly efficient. Wavelets strike just the right compromise in terms of compressing in both time and frequency.