This page is about wavelets that constitute a further development of central ideas of the Fourier transform. Wavelets are important in numerical mathematics, statistical analysis and simulations, especially in the Earth sciences.
On this page we will first explain why wavelets were developed as a substitute of the Fourier transform for specific applications.
As is the case with the Fourier transform, there is both a continuous theory of functions of the real numbers and a discrete theory of finite and infinite series. We will first define the continuous transformation and point out similarities and differences to the Fourier transform. Then we will develop the discrete wavelet transform, and finally discuss some concrete algorithms and applications.
Heisenberg’s uncertainty relation says that one cannot know both the location and the impulse of an elementary quantum particle: If we know the location exactly (the wave function is a delta function in position space), we don’t know anything about the impulse (the wave function is multiple of $e^{i x p}$ in impulse space). Vice versa, if we know the impulse exactly, we don’t know anything about the location.
The Fourier transform of a function in position space is in a generalized sense a coefficient expansion in a basis of functions $e^{i \omega t}$ that are optimally localized in impulse space and maximal not localized in position space.
Wavelets on the other hand are orthonormal bases or, more general, frames of the Hilbert space of wave functions such that if we learn about the coefficients of a wave function with respect to this basis, we learn simultaneously about the location and the impulse of the particle described by the wave function. That is, wavelets are constructed in a way that they are both localized in position and in impulse space. This makes them often more useful in practical problems than the Fourier transform.
“Dual representations” such as a Fourier transform or wavelet transform are generally used to map a complete “function” (e.g., set of (time,value) pairs over a time interval) to an alternate representation, often with little knowledge of how the function is formed. Mostly, although not always, the advantage of using a dual representation is that it converts a dense (i.e., not sparse) representation into a representation which is either
exactly sparse, or
effectively sparse, in the sense that many small coefficients can be zeroed with negligable deviation of the overall signal from the original.
This sparsity is useful both for:
Dramatically reducing the amount of computation in actual problems, eg, allowing sparse matrix solvers to be used.
Restricting the number of non-zero wavelet coefficients can be used as a mechanism to prevent overfitting.
Whilst Fourier representations are sparse for certain kinds of problems, many of the problems occurring in practical application are sparse when using a wavelet representation. This is often related to the simultaneous localisation in time and space refered to above.
Imagine listening to a symphony. The representation of the signal in time is e.g. the air pressure at your ears. The Fourier transform of the signal is the representation in frequency space that specifies the amount that each frequency contributes to the whole symphony. Both representations are obviously not very useful to humans :-)
The representation of the signal via notes is a wavelet representation, every note represents a contribution of a signal that is both localized in time and in frequency.
We will first define several concepts of the continuous wavelet transform before we concentrate on the discrete transform and its applications.
The following definition of a wavelet may appear rather technical right know, but we will explain the motivation behind it further down.
Wavelet
A function $\psi \in L^2(\mathbb{R})$ and unit norm $\|\psi\| = 1$ is called a wavelet if it fullfills the following consistency condition
where $\widehat{\psi}$ denotes the Fourier transform.
We will often write
Now that we know what a wavelet is, we are ready to define the wavelet transform:
Continuous Wavelet Transform
Let $\psi \in L^2(\mathbb{R})$ be a wavelet. The continuous wavelet transform Wf of a function $\f \in L^2(\mathbb{R})$ is the following function:
The numbers $Wf(a, b)$ are called the wavelet coefficients of f.
A wavelet coefficient of a function is supposed to tell us about the mean of this function with respect to a certain time and frequency interval (“interval” in an approximate sense). In order to be a mean value, the wavelet itself should put equal weight to the positive and the negative part of the sample function. The next proposition tells us that the consistency condition implies this, if we add an additional assumption.
Wavelet Mean is Zero Let $\psi$ be a wavelet. If in addition $t \psi \in L^1{\mathbb{R}}$, then we have
Just like the Fourier transform is invertible, there is an inversion formula for the wavelet transform too. It was discovered by A.P. Calderón in 1964 and rediscovered by A.Grossmann and J.Morlet in 1984:
Inversion Formula (Calderón , Grossmann-Morlet )
Let $\psi$ be a wavelet and in addition $\psi \in L^1(\mathbb{R})$. Then for every function $\f \in L^2(\mathbb{R})$ we have the following inversion formula of the continuous wavelet transform:
in $L^2{\mathbb{R}}$. To be more precise, the right hand side defines for every $\epsilon \gt 0$ a function $\in L^2(\mathbb{R})$ which converges to f in $L^2(\mathbb{R})$ as $\epsilon \to 0$.
This “full fledged” inversion formula may look a little bit complicated, and its proof certainly is. If we add some “nicety” assumptions, we get an inversion formula that is considerably easier to prove, and guarantees point-wise convergence instead of convergence in $L^2$, which is more interesting for many applications:
Simplified Inversion Formula
Let $\psi$ be a wavelet. Let $f \in L^2(\mathbb{R})$ be continuous and both $f \in L^1(\mathbb{R})$ and $\hat{f} \in L^1(\mathbb{R})$. Then for every $t \in \mathbb{R}$ we have:
Note that the convergence is pointwise for every $t$.
The wavelet coefficients are highly redundant, we can expect that we won’t need them all to reconstruct a function f from the coefficients. The fact that the coefficients depend on two parameters, while the original function depends on one parameter may be taken as a hint into this direction (although we will not try to make this argument rigorous for good reasons :-).
The development of the discrete wavelet transform therefore starts with the observation that it may be enough to specify a countable set of points $P := \{a_{i, j}, b_{i_j}\}$ such that the original function f can be reconstructed from the wavelet coefficients $Wf(a_{i, j}, b_{i_j})$ alone.
Since we have a scaling parameter $a$ and a translation parameter $b$, it is natural to define the values of $a$ to be multiples of a given constant $\sigma \gt 0$, and $b$ to run through multiples of a constant offset.
Discrete Wavelet Transform (DWT), first half
Let $\psi \in L^2(\mathbb{R})$ be a wavelet. Let $\sigma \gt 0$ and $\tau \gt 0$ and define
In this context $\psi$ is often called the mother wavelet and the functions $\psi_{j, k}$ are called child wavelets.
The discrete wavelet transform $W_{\sigma, \tau}f$ of a function $\f \in L^2(\mathbb{R})$ is defined as follows:
This definition is only the first half, because we have to add conditions on $(\psi, \sigma, \tau)$ that makes the transformation useful in a sense we will explain now:
we’d like that the DWT is defined for all $\f \in L^2(\mathbb{R})$,
we’d like that the DWT is a continuous operator, such that small errors in the initial data are not amplified uncontrollably ,
we’d like the DWT to hava a continuous inverse, so that the inverse operation, the reconstruction of a signal from it’s DWT coefficients, does not amplify any error in the coefficients uncontrollably.
From elementary operator theory we see that the child wavelets should form a frame:
Frame and second half of DWT
A frame in a separable Hilbert space is a countable set of vectors $\{ \phi_j \}$ such that there are constants $c_1, c_2 \gt 0$ such that for every vector $f$ we have
The term frame is therefore a generalization of orthonormal base of a separable Hilber space.
A triple $(\psi, \sigma, \tau)$ generates a DWT if the child wavelets form a frame in $L^2(\mathbb{R})$.
Unlike the Fourier transform there is, strictly speaking, not “one” wavelet transform, since there are many functions that we would call “wavelet” according to our definition. Different wavelets give different wavelet transforms. There are several criteria of what “goodness” of a wavelet is, depending on the applications one has in mind, and several approaches to constructing new wavelets.
The most popular one is the so-called multiresolution analysis.
The simplest example is the Haar wavelet, named after a student of David Hilbert:
A more sophisticated family of wavelets are Daubechies wavelets.
There are a lot of textbooks about wavelets so any list of references is biased.
Stéphane Mallat: A wavelet tour of signal processing. The sparse way. 3rd ed. (ZMATH)
Yves Meyer: Wavelets and operators. (ZMATH)
Stéphane Jaffard, Yves Meyer and Robert D. Ryan: Wavelets. Tools for science and technology. (ZMATH)
Karsten Urban: Wavelets in Numerical Simulation (ZMATH)
Wavelets also play a role in physics:
Syed Twareque Ali, Jean-Pierre Antoine and Jean-Pierre Gazeau, Coherent states, wavelets and their generalizations. (ZMATH)
G. Battle, Wavelets and Renormalization. (ZMATH)