A brilliant talk by Persi Diaconis !
Is this random ? Card shuffling, coins and their friends …
June 23, 2010 at 5:57 pm (probability, random permutation)
Tags: card shuffling, coins, persi diaconis, poker, randomness
Potts model and Monte Carlo Slow Down
June 22, 2010 at 12:24 am (markov chain, Monte Carlo, probability)
Tags: Ising, MCMC, Metropolis, Monte Carlo, Potts model
A simple model of interacting particles
The mean field Potts model is extremely simple: there are interacting particles
and each one of them can be in
different states
. Define the Hamiltonian
where and
is the Kronecker symbol. The normalization
ensures that the energy is an extensive quantity so that the mean energy per particle
does no degenerate to
or
for large values of
. The sign minus is here to favorize configurations that have a lot of particles in the same state. The Boltzman distribution at inverse temperature
on
is given by
where is a normalization constant. Notice that if we choose a configuration uniformly at random in
, with overwhelming probability the ratio of particles in state
will be close to
. Also it is obvious that if we define
then will be close to
for a configuration taken uniformly at random. Stirling formula even says that the probability that
is close to
is close to
where
Indeed . The situation is quite different under the Boltzman distribution since it favorizes the configurations that have a lot of particles in the same state: this is because the Hamiltonian
is minimized for configurations that have all the particles in the same state. In short there is a competition between the entropy (there are a lot of configurations close to the ratio
) and the energy that favorizes the configurations where all the particles are in the same state.
With a little more work, one can show that there is a critical inverse temperature such that:
- for
the entropy wins the battle: the most probable configurations are close to the ratio
- for
the energy effect shows up: there are
most probable configurations that are the permutations of
where
and
are computable quantities.
The point is that above the system has more than one stable equilibrium point. Maybe more important, if we compute the energy of these most probable states
then this function has a discontinuity at . I will try to show in the weeks to come how this behaviour can dramatically slow down usual Monte-Carlo approach to the study of these kind of models.
Hugo Touchette has a very nice review of statistical physics that I like a lot and a good survey of the Potts model. Also T. Tao has a very nice exposition of related models. The blog of Georg von Hippel is dedicated to similar models on lattices, which are far more complex that this mean field approximation presented here.
MCMC Simulations
These is extremely easy to simulate this mean field Potts model since we only need to keep track of the ratio to have an accurate picture of the system. For example, a typical Markov Chain Monte Carlo approach would run as follows:
- choose a particle
uniformly at random in
- try to switch its value uniformly in
- compute the Metropolis ratio
- update accordingly.
If we do that times for
states at inverse temperature
and for
particles (which is fine since we only need to keep track of the
-dimensional ratio vector) and plot the result in barycentric coordinates we get a picture that looks like:
Here I started with a configuration where all the particles were in the same states i.e ratio vector equal to . We can see that even with
steps, the algorithm struggles to go from one most probable position
to the other two
and
– in this simulation, one of the most probable state has even not been visited! Indeed, this approach was extremely naive, and this is quite interesting to try to come up with better algorithms. Btw, Christian Robert’s blog has tons of interesting stuffs related to MCMC and how to boost up the naive approach presented here.
Challenge – Announcement
June 6, 2010 at 12:44 pm (17x17 challenge, Monte Carlo, probability)
Tags: 17x17 challenge, MCMC, Monte Carlo, optimization, simulated annealing, William Gasarch
Have you already heard about the 17×17 Challenge ? It all started 7 months ago, and still no solution has been found. I have the feeling though that no really serious attempt at solving the problem has been done, even if very encouraging results have been found. It could be very interesting to keep track of the progress made so that results will be published here. Send what you have reached so far!
In the next few weeks, I will try to write about the Monte Carlo approaches described in the excellent book “Monte Carlo Strategies in Scientific Computing” by Jun Liu: I do not think that one can easily solve the problem with a direct implementation of these methods, but I want to know more about it. These are all more or less Markov Chain Monte Carlo Methods: I have tried the “parallel tempering” method, which gives results that are not that bad, and was planning to check the “flat histogram” method. “Simulated annealing” methods have succesfully been used to solve the Sudoku problem though: see Christian Robert blog for example. I am also planning to continue my reading of the fascinating book “Information, Physics, and Computation” by Marc Mezard: methods described in this book (belief propagation, k-SAT problem, etc…) are more adapted to the 17×17 challenge I think, and I am quite optimistic about it.
Read also here,here,here,here and there – and have a try with this on-line simulator !
Random permutations and giant cycles
June 3, 2010 at 11:07 pm (coagulation, fluid limit, fragmentation, markov chain, probability, random permutation)
Tags: Erdos Renyi, giant cycle, markov chain, phase transition, random graph, random permutation
The other day, Nathanael Berestycki presented some very nice results on random permutations. Start with the identity permutation of and compose with random transpositions: in short, at time
we consider the random permutation
where
is a transposition chosen uniformly at random among the
transpositions of
. How long do we have to wait before seeing a permutation that has a cycle of length greater than
, where
is a fixed constant ?
The answer has been known for quite a long time (O. Schramm,2005), but no simple proof was available: the highlight of Nathanael’s talk was a short and elegant proof of the fact that we only have to wait a little bit more than steps to see this giant cycle! The first half of the proof is such a beauty that I cannot resist in sharing it here!
1. Visual representation
First, this is more visual to represent a permutation by its cycle decomposition. Consider for example the permutation that sends
etc… It can be graphically represented by the following pictures:
If this permutation is composed with the transposition , this gives the following cycle structure:
If it is composed with this gives,
It is thus pretty clear that the cycle structure of can be obtained from the cycle structure of
in only 2 ways:
- or two cycles of
merge into one,
- or one cycle of
breaks into two cycles.
In short, the process behaves like a fragmentation-coalescence processes. At the beginning, the cycle structure of
is trivial: there are
trivial cycles of length one! Then, as times goes by, they begin to merge, and sometimes they also break down. At each step two integers
and
are chosen uniformly at random (with
): if
and
belong to the same cycle, this cycle breaks into two parts. Otherwise the cycle that contains
merges with the cycle that contains
. Of course, since at the beginning all the cycles are extremely small, the probability that a cycles breaks into two part is extremely small: a cycle of size
breaks into two part with probability roughly equal to
.
2. Comparison with the Erdos-Renyi random Graph
Since the fragmentation part of the process can be neglected at the beginning, this is natural to compare the cycle structure at time with the rangom graph
where
: this graph has
vertices and roughly
edges since each vertex is open with probability
. A coupling argument makes this idea more precise:
- start with
and the graph
that has
disconnected vertices
- choose two different integers
- define
- to define
, take
and join the vertices
and
(never mind if they were already joined)
- go to step
.
The fundamental remark is that each cycle of is included in a connected component of
: this is true at the beginning, and this property obviously remains true at each step. The Erdos-Renyi random graph
is one of the most studied object: it is well-known that for
, with probability tending to
, the size of the largest connected component of
is of order
. Since
, this immediately shows that for
, with
the length of the longuest cycle in
is of order
or less. This is one half of the result. It remains to work a little bit to show that if we wait a little bit more than
, the first cycle of order
appears.
For a permutation with cycle decomposition
(cycles of length
included), one can consider the quantity
This is a quantity between (for the identity) and
that measures in some extends the sizes of the cycles. In order to normalize everything and to take into account that the cycle structure abruptly changes near the critical value
, one can also observe this quantity on a logarithmic scale:
This quantity lies between
and
, and evolves quite smoothly with time: look at what it looks like, once the time has been rescaled by a factor
:
The plot represented the evolution of the quantity between time
and
for dimensions
.
Question: does the random quantity converge (in probability) towards a non trivial constant
? It must be the analogue quantity for the random graph
, so I bet it is a known result …
Doob H-transforms
June 2, 2010 at 11:21 am (Brownian motion, Markov process, probability, Stochastic analysis)
Tags: brownian motion, conditioning, Doob, h-transform, Markov process, probability
I read today about Doob h-transforms in the Rogers-Williams … It is done quite quickly in the book so that I decided to practice on some simple examples to see how this works.
So we have an Markov process living in
, and we want to see how this process looks like if we condition on the event
, say. To fix the notations we define
and
. The conditioned semi-group
is quite easily computed from
and
. Indeed, this is also equal to
Notice also that is indeed a Markov kernel in the sense that
: the only property needed for that is
In fact, we could take any function that satisfies this equality and define a new Markovian kernel
and study the associated Markov process. That’s what people usually do by the way.
Remark 1 we almost never know explicitly the quantity
, except in some extremely simple cases !
Before trying these ideas on some simple examples, let us see what this says on the generator of the process:
- continuous time Markov chains, finite state space:let us suppose that the intensity matrix is
and that we want to know the dynamic on
of this Markov chain conditioned on the event
. Indeed
so that
so that in the limit we see that at time
, the intensity of the jump from
to
of the conditioned Markov chain is
Notice how this behaves while
: if at
the Markov chain is in state
then the intensity of jump from
to
is equivalent to
.
- diffusion processes:this time consider a
-dimensional diffusion
on
conditioned on the event
and define as before
. The generator of the (non-homogeneous) conditioned diffusion is defined at time
by
so that if
is the generator of the original diffusion we get
Because
, this also reads
This means that the conditioned diffusion
follows the SDE:
The volatility function remains the same while an additional drift shows up.
We will try these ideas on some examples where the probability densities are extremely simple. Notice that in the case of diffusions, if we take , the function
is identically equal to
(except degenerate cases): to condition on the event
we need instead to take
to be the transition probability
. This follows from the approximation
. Let’s do it:
- Brownian Bridge on
:in this case
so that the additional drift reads
: a Brownian bridge follows the SDE
This might not be the best way to simulate a Brownian bridge though!
- Poisson Bridge on
:we condition a Poisson process of rate
on the event
. The intensity matrix is simply
and
everywhere else while the transition probabilities are given by
. This is why at time
, the intensity from
to
is given by
Again, that might not be the most efficient way to simulate a Poisson Bridge ! Notice how the intensity
has disappeared …
- Ornstein-Uhlenbeck Bridge:Let’s consider the usual OU process given by the dynamic
: the invariant probability is the usual centred Gaussian distribution. Say that we want to know how does such an OU process behave if we condition on the event
. Because
we find that the conditioned O-U process follows the SDE
If we Taylor expand the additonal drift, it can be seen that this term behaves exactly as in the case of the Brownian bridge. Below is a plot of an O-U process conditioned on the event
, starting from
.





