SMC sampler

Sequential Monte Carlo (SMC) methods are usually designed to deal with probability distributions of increasing dimension. The canonical example is the sequence of distributions {\pi_n(X_{1:n}) = \mathbb{P}(X_{1:n} \,|\, Y_{1:n})}, where {\{ X_k \}_{k \geq 0}} is a hidden Markov chain and {\{ Y_k \}_{k \geq 0}} is an observation process; the notation {X_{1:n}} is a shortcut for {(X_1, X_2, \ldots, X_n)}. This setting is usually called a Hidden Markov Model in the literature. If the Markov chain {X_k} evolves in {\mathbb{R}^d}, then the distribution {\pi_n} is a probability distribution on {\big( \mathbb{R}^d \big)^n}.

It turns out that one can use the same kind of ideas to study a single distribution {\pi}. If {\pi} is difficult to sample from directly and traditional MCMC methods perform poorly (because of dimensionality, multi-modality, poor mixing properties, etc.), one can try to use SMC methods to study the distribution {\pi}. To this end, one can introduce a sequence of bridging distributions

\displaystyle \begin{array}{rcl} \pi_0, \pi_1, \ldots, \pi_n \end{array}

where {\pi_0} is typically very easy to sample from and {\pi_n=\pi}. A traditional choice is {\pi_k(x) \propto \pi(x)^{\alpha_k}} where {0 \leq \alpha_0 < \alpha_1 < \ldots < \alpha_n = 1}. The larger the index {k}, the more intricate the distribution {\pi_k}. The problem is that each distribution {\pi_k} has the same dimensionality as the original distribution {\pi}, so that it is not obvious how to use all the machinery of SMC methods. For the sake of concreteness, let us suppose that {\pi} is a distribution on {\mathbb{R}^d}. A fundamental remark, which is the basis of what are now known as Sequential Monte Carlo samplers, is that {\pi_k(x_k)} can be seen as the marginal distribution of the probability
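For concreteness, here is one way such a tempered family could be encoded in Python; the function names and the linear schedule of exponents are purely illustrative choices, not prescriptions:

```python
import numpy as np

def log_bridge(log_pi, alpha):
    """Unnormalised log-density of pi_alpha(x) propto pi(x)**alpha."""
    return lambda x: alpha * log_pi(x)

# Toy example: a standard Gaussian target and a linear schedule of exponents
log_pi = lambda x: -0.5 * np.sum(x ** 2)
alphas = np.linspace(0.01, 1.0, 100)   # 0 < alpha_1 < ... < alpha_n = 1
bridges = [log_bridge(log_pi, alpha) for alpha in alphas]
```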

\displaystyle \begin{array}{rcl} \Pi_k(x_1, \ldots, x_k) = \pi_k(x_k) L_k(x_k, x_{k-1})L_{k-1}(x_{k-1}, x_{k-2}) \ldots L_2(x_2,x_1) \end{array}

where the {L_j(\cdot, \cdot)} are Markov kernels in the sense that {\int L_j(x, y) \, dy = 1} for any {x}. Indeed, integrating {\Pi_k} over {x_1}, then {x_2}, and so on up to {x_{k-1}}, each kernel integrates to one and only {\pi_k(x_k)} remains. The distribution {\Pi_k} lives on {\big( \mathbb{R}^d \big)^k}, and one can use the traditional SMC methods to explore the sequence of distributions {\Pi_1, \ldots, \Pi_n}. This algorithm was devised a few years ago by Pierre Del Moral, Arnaud Doucet and Ajay Jasra. It is extremely powerful, and several other algorithms can be seen as particular cases of it.

Because I had never implemented the algorithm before, I tried yesterday to see how it worked on a very simple example. My target distribution was an equal mixture of seven Gaussian distributions in {\mathbb{R}^2}, each with unit covariance matrix, centred at the vertices of a regular heptagon of radius {20}. I started with {N=3000} particles at the centre of the heptagon and let them evolve according to a random walk. Whenever the Effective Sample Size (ESS) dropped below {N/2}, the whole particle system was resampled. In order to obtain a sweet movie, I used {100} bridge distributions {\pi_k \propto \pi^{k/100}}.
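For the record, here is a minimal NumPy sketch of this experiment, assuming the standard choice of backward kernels {L_k} given by the time reversal of the forward MCMC moves, so that the incremental weight at step {k} is {\pi_k(x)/\pi_{k-1}(x)}. The random-walk step size, the multinomial resampling, and the single Metropolis move per bridge are arbitrary choices for illustration, not details of my original script:

```python
import numpy as np

# Target: equal mixture of seven unit-covariance Gaussians centred at the
# vertices of a regular heptagon of radius 20.
angles = 2 * np.pi * np.arange(7) / 7
centres = 20.0 * np.column_stack([np.cos(angles), np.sin(angles)])

def log_target(x):
    # x has shape (N, 2); returns log pi(x) up to an additive constant
    d2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.logaddexp.reduce(-0.5 * d2, axis=1)

rng = np.random.default_rng(0)
N, n_bridges, step = 3000, 100, 1.0                # step size: my own guess
alphas = np.arange(1, n_bridges + 1) / n_bridges   # pi_k propto pi**(k/100)

x = np.zeros((N, 2))    # all particles start at the centre of the heptagon
logw = np.zeros(N)      # log-weights
lt = log_target(x)

for a_prev, a in zip(np.concatenate(([0.0], alphas[:-1])), alphas):
    # Incremental weight pi_k(x)/pi_{k-1}(x) = pi(x)**(a - a_prev)
    logw += (a - a_prev) * lt

    # Resample (multinomially) when the ESS drops below N/2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    if 1.0 / np.sum(w ** 2) < N / 2:
        idx = rng.choice(N, size=N, p=w)
        x, lt = x[idx], lt[idx]
        logw = np.zeros(N)

    # One random-walk Metropolis move per particle, targeting pi**a
    prop = x + step * rng.standard_normal((N, 2))
    lt_prop = log_target(prop)
    accept = np.log(rng.random(N)) < a * (lt_prop - lt)
    x[accept], lt[accept] = prop[accept], lt_prop[accept]
```

After the last bridge, the weighted particles {(x, \log w)} approximate the target {\pi}; saving a scatter plot of the particles at every iteration produces exactly the kind of movie shown below.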

The result is clear and quite impressive, even though it could have been made much more efficient by using far fewer bridge distributions. Some recent theory has been developed to analyse the stability of the algorithm in high dimensional situations. The analysis described in the article uses a scaling limit approach very similar to some ideas used to analyse MCMC algorithms, and to some more recent developments along the same lines that we have done with Andrew and Natesh. Very exciting stuff!

[EDIT: 2 October 2012]
I received several emails asking me to describe how to obtain the short animation showing the evolution of the particle system. The code is available here. This is a quick python script producing a bunch of .png files. It then suffices to stitch them together to obtain an animated .gif file. This can, for example, easily be done using the free (and very good) photoshop-like software GIMP: open all the .png files as different layers, then export the result as a .gif file with the option ‘animated gif’ checked. Voila!
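Alternatively, if you prefer to stay in Python, a few lines with the Pillow library can stitch the frames together; the frame_*.png file names here are just an assumption about how the frames were saved:

```python
from PIL import Image
import glob

# Load the frames in order and write them out as a looping animated gif
frames = [Image.open(f) for f in sorted(glob.glob("frame_*.png"))]
frames[0].save("smc.gif", save_all=True, append_images=frames[1:],
               duration=100, loop=0)   # 100 ms per frame, loop forever
```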