While trying to understand the difference between wavelets and the Fourier transform, I came across this point on Wikipedia:
The main difference is that wavelets are localized in both time and frequency whereas the standard Fourier transform is only localized in frequency.
I did not understand what is meant here by "localized in time and frequency."
Can someone please explain what this means?
Very roughly speaking: you can think of the difference in terms of the Heisenberg Uncertainty Principle, one version of which says that "bandwidth" (frequency spread) and "duration" (temporal spread) cannot both be made arbitrarily small.
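For reference, one standard way to write this version of the principle is sketched below. The normalization is an assumption, since the statement above does not fix one: take $\int |f(t)|^2\,dt = 1$ and $\hat{f}(k) = \int f(t)\,e^{-ikt}\,dt$, with $k$ an angular frequency, and let $t_0$ and $k_0$ be any fixed reference time and frequency.

$$
(\Delta t)^2 = \int (t - t_0)^2\,|f(t)|^2\,dt, \qquad
(\Delta k)^2 = \frac{1}{2\pi}\int (k - k_0)^2\,|\hat{f}(k)|^2\,dk, \qquad
\Delta t\,\Delta k \;\ge\; \frac{1}{2}.
$$

Under these conventions equality holds for Gaussians, which is one reason Gaussian-windowed waves are a common choice of wavelet.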
The classical Fourier transform of a function allows you to make a measurement with 0 bandwidth: the evaluation $\hat{f}(k)$ tells us precisely the size of the component of frequency $k$. But by doing so you lose all control over temporal duration: you do not know when in time the signal is sounded. This is the limiting case of the Uncertainty Principle: absolute precision on frequency and zero control on temporal spread. (Whereas the original signal, when measured at a fixed time, gives you absolute precision on the amplitude at that fixed time but zero information about the frequency spectrum of the signal, and so represents the other extreme of the Uncertainty Principle.)
The wavelet transform takes advantage of the intermediate cases of the Uncertainty Principle. Each wavelet measurement (the wavelet transform corresponding to a fixed parameter) tells you something about the temporal extent of the signal, as well as something about the frequency spectrum of the signal. That is to say, from the parameter $w$ (which is the analogue of the frequency parameter $k$ for the Fourier transform), we can derive a characteristic frequency $k(w)$ and a characteristic time $t(w)$, and say that our initial function includes a signal of "roughly frequency $k(w)$" that happened at "roughly time $t(w)$".
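To make the idea of a single wavelet measurement concrete, here is a minimal NumPy sketch. It assumes a Morlet-like (Gaussian-windowed complex exponential) mother wavelet and takes the parameter $w$ to be a pair (scale, shift). The names `OMEGA0`, `mother_wavelet`, `characteristic_frequency`, `scale_for_frequency` and `wavelet_measurement` are made up for this illustration and do not come from any library.

```python
import numpy as np

OMEGA0 = 6.0  # dimensionless center frequency of the mother wavelet (an assumed, common choice)

def mother_wavelet(u):
    """Morlet-like wavelet: a complex wave under a Gaussian envelope."""
    return np.pi ** -0.25 * np.exp(-0.5 * u ** 2) * np.exp(1j * OMEGA0 * u)

def characteristic_frequency(scale):
    """k(w): the frequency in Hz that a wavelet of this scale responds to most."""
    return OMEGA0 / (2 * np.pi * scale)

def scale_for_frequency(freq_hz):
    """Inverse of characteristic_frequency."""
    return OMEGA0 / (2 * np.pi * freq_hz)

def wavelet_measurement(signal, t, scale, shift):
    """|<signal, psi_w>| for w = (scale, shift); t(w) is simply `shift`."""
    dt = t[1] - t[0]
    daughter = mother_wavelet((t - shift) / scale) / np.sqrt(scale)
    return abs(np.sum(signal * np.conj(daughter)) * dt)

# Test signal: a 50 Hz tone that is only "sounded" between t = 1 s and t = 2 s.
t = np.arange(0.0, 4.0, 0.001)
burst = np.where((t >= 1.0) & (t < 2.0), np.sin(2 * np.pi * 50.0 * t), 0.0)

hit = wavelet_measurement(burst, t, scale_for_frequency(50.0), shift=1.5)
wrong_time = wavelet_measurement(burst, t, scale_for_frequency(50.0), shift=3.0)
wrong_freq = wavelet_measurement(burst, t, scale_for_frequency(120.0), shift=1.5)

print(hit, wrong_time, wrong_freq)
```

Only the first measurement should come out large: the wavelet has to match the signal both in "roughly frequency" and in "roughly time" to respond.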
How is this helpful? Let us say we are looking at the signal of the light emitted from a traffic light. So for some time it will be red, and for some time it will be green (ignore the yellow for now). If we take the Fourier transform of the observed signal, we can say that at some point in time the traffic light shows red, and at some point in time it shows green.
But a functioning traffic light would have either red or green shown at a time, and not both. And if the traffic light malfunctions and shows both lights at the same time, we would still see the same thing from the Fourier transform: red at some point in time, and green at some point in time.
But if we take the wavelet transform we can sacrifice frequency precision to gain temporal information. So with the wavelet transform done on the working traffic light we may see that, at around 1 o'clock, only one of the two colors was present.
This would tell us not only that the traffic light can show both red and green lights, but also that, at least at around 1 o'clock, the light was working properly and showing only one light.
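The traffic light comparison can be tried numerically. The sketch below is only an illustration under stated assumptions: the two "colors" are stand-in tones at 10 Hz and 25 Hz rather than real optical frequencies, the record is 4 seconds long, and the time-localized measurement uses a Gaussian-windowed wave (Morlet-like) as the wavelet. The helper names `fourier_strength` and `local_strength` are invented for this example.

```python
import numpy as np

FS = 200.0                     # samples per second (illustrative)
RED_HZ, GREEN_HZ = 10.0, 25.0  # stand-in "colors", not real optical frequencies
t = np.arange(0.0, 4.0, 1.0 / FS)

red = np.sin(2 * np.pi * RED_HZ * t)
green = np.sin(2 * np.pi * GREEN_HZ * t)

working = np.where(t < 2.0, red, green)  # red first, then green
broken = red + green                     # both colors on at all times

def fourier_strength(signal, freq):
    """Size of the Fourier component at `freq`, computed from the whole record."""
    return abs(np.sum(signal * np.exp(-2j * np.pi * freq * t))) / len(t)

def local_strength(signal, freq, center, cycles=6.0):
    """Size of a Gaussian-windowed measurement at `freq`, taken near time `center`."""
    width = cycles / (2 * np.pi * freq)  # higher frequency means a shorter window
    atom = np.exp(-0.5 * ((t - center) / width) ** 2) * np.exp(2j * np.pi * freq * (t - center))
    atom /= np.sum(np.abs(atom))         # so a steady unit-amplitude tone gives roughly 0.5
    return abs(np.sum(signal * np.conj(atom)))

for name, light in (("working", working), ("broken", broken)):
    print(name,
          "| Fourier red/green:",
          round(fourier_strength(light, RED_HZ), 3),
          round(fourier_strength(light, GREEN_HZ), 3),
          "| near t = 1 s, red/green:",
          round(local_strength(light, RED_HZ, 1.0), 3),
          round(local_strength(light, GREEN_HZ, 1.0), 3))
```

For both lights the Fourier measurement reports that red and green are each present somewhere in the record. The localized measurement near $t = 1$ is what separates them: the working light shows essentially only red there, while the broken one shows both.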