1. Heisenberg Uncertainty Principle
The Heisenberg Uncertainty principle states that a signal cannot be both highly concentrated in time and highly concentrated in frequency. For example, consider a square-integrable function normalised so that . In this case, defines a probability distributions on the real line. The Fourier isometry shows that its Fourier transform also defines a probability distribution . In order to measure how spread-out these two probability distributions are, one can use their variance. The uncertainty principle states that
where designates the variance of the distribution . This shows for example that and cannot be simultaneously less than .
We can play the same with a vector and its (discrete) Fourier transform . The Fourier matrix is defined by
where the normalisation coefficient ensures that . The Fourier inversion formula reads . To measure how spread-out the coefficients of a vector are, one can look at the size of its support
If an uncertainty principle holds, one should be able to bound from below. There is no universal constant such that
for any and . Indeed, if and one readily checks that and so that . Nevertheless, in this case this gives . Indeed, a beautiful result of L. Donoho and B. Stark shows that
Maybe more surprisingly, and crucial to applications, they even show that this principle can be made robust by taking noise and approximations into account. This is described in the next section.
2. Robust Uncertainty Principle
Consider a subset of indices and the orthogonal projection on . In other words, if , then
We say that a vector is -concentrated on if . The -robust uncertainty principle states that if is concentrated on and is -concentrated on then
Indeed, the case gives Equation (5). The proof is surprisingly easy. On introduces the reconstruction operator defined by
In words: take a vector, delete the coordinate outside , take the Fourier transform, delete the coordinate outside , finally take the inverse Fourier transform. The special case simply gives . The proof consists in bounding the operator norm of from above and below. The existence of a vector that is -concentrated such that is -concentrated gives a lower bound. In details:
- Upper bound: it is obvious that is a contraction in the sense that since are isometries and are orthogonal projections. Moreover, the smaller and , the smaller the norm of . Since is an isometry we have , where the supremum is over all that are supported by . Cauchy-Schwarz shows that
In other words, the reconstruction operator satisfies . This is a non-trivial bound only if .
- Lower bound: consider that satisfies and . The Fourier transform is an isometry so that the triangle inequality easily shows that in the sense that
This proves the lower bound .
In summary, the reconstruction operator always satisfies . Moreover, the existence of satisfying and implies the lower bound . Therefore, the sets and must verify the uncertainty principle
Notice that we have only use the fact that the entries of the Fourier matrix are bounded in absolute value by . We could have use for example any other unitary matrix . The bound for all gives the uncertainty principle . In general since is an isometry.
One can work out an upper bound and a lower bound for the reconstruction operator using the -norm instead of the -norm. Doing so, one can prove that if and then
3. Stable Reconstruction
Let us see how the uncertainty principle can be used to reconstruct a corrupted signal. For example, consider a discrete signal corrupted by some noise . In top of that, let us suppose that on the set the receiver does not receive any information. In other words, the receiver can only observe
In general, it is impossible to recover the original , or an approximation of it, since the data for are lost forever. In general, even if the received signal is very weak, the deleted data might be huge. Nevertheless, under the assumption that the frequencies of are supported by a set satisfying , the reconstruction becomes possible. It is not very surprising since the hypothesis implies that the signal is sufficiently smooth so that the knowledge of on is enough to construct an approximation of for . We assume that the set is known to the observer. Since the components on are not observed, one can suppose without loss of generality that there is no noise on these components ( we have for ): the observation can be described as
It is possible to have a stable reconstruction of the original signal if the operator can be inverted: there exists a linear operator such that for every . If this is the case, the reconstruction
satisfies so that . A sufficient condition for to exist is that for every . This is equivalent to
Since for we have it follows that where is the reconstruction operator. The condition thus follows from the bound . Consequently the operator is invertible since satisfies
Since we have the bound . Therefore
Notice also that can easily and quickly be approximated using the expansion
Nevertheless, there are two things that are not very satisfying:
- the bound is extremely restrictive. For example, if of the component are corrupted, this imposes that the signal has only (and not ) non-zero Fourier coefficients.
- in general, the sets and are not known by the receiver.
The theory of compressed sensing addresses these two questions. More on that in forthcoming posts, hopefully … Emmanuel Candes recently gave extremely interesting lectures on compressed sensing.
4. Exact Reconstruction: Logan Phenomenon
Let us consider a slightly different situation. A small fraction of the components of a signal are corrupted. The noise is not assumed to be small. More precisely, suppose that one observes
where is the set of corrupted components and is an arbitrary noise that can potentially have very high intensity. Moreover, the set is unknown to the receiver. In general, it is indeed impossible to recover the original signal . Surprisingly, if one assumes that the signal is band-limited the Fourier transform of is supported by a known set that satisfies
it is possible to recover exactly the original . The main argument is that the condition (20) implies that the signal has more that of its energy on in the sense that . Consequently, since is the set where the signal is perfectly known, enough information is available to recover the original signal . To prove the inequality , it suffices to check that
In this case we have , which prove that has more that of its energy on . Indeed, the same conclusion holds with the -norm if the condition holds. The magic of the -norm is that condition implies that the signal is solution of the optimisation problem
This would not be true if one were using the norm instead. This idea was first discovered in a slightly different context in Logan’s thesis (1965). More details here. To prove (22), notice that any can be written as with . Therefore, since , it suffices to show that
This is equivalent to proving that for any non-zero we have
It is now that the -norm is crucial. Indeed, we have
which gives the conclusion. This would not work for any -norm with since in general : the inequality is in the wrong sense. In summary, as soon as the condition is satisfied, one can recover the original signal by solving the minimisation problem (22).
\noindent Because it is not hard to check that in general we always have
this shows that the condition implies that exact recovery is possible! Nevertheless, the condition is far from satisfying. The inequality is tight but in practice it happens very often that even if is much bigger that .
The take away message might be that perfect recovery is often possible if the signal has a sparse representation in an appropriate basis (in the example above, the usual Fourier Basis): the signal can then be recovered by -minimisation. For this to be possible, the representation basis (here the Fourier basis) must be incoherent with the observation basis in the sense that the coefficients of the change of basis matrix (here, Fourier matrix) must be small if is the observation basis and is the representation basis then must be small for every . For example, for the Fourier basis we have for every .