## Fluid Limits and Random Graphs

The other day Christina Goldschmidt gave a very nice talk on results obtained by James Norris and R. W. R. Darling: in their paper they use fluid limit techniques in order to study properties a Random graphs. This approach has long been used to analyse queuing systems for example.

1. Erdos Renyi Random Graph model

Remember that the Erdos Renyi random graph ${\mathcal{G}(N,p)}$ is a graph with ${N}$ vertices ${\{1,2, \ldots, N\}}$ such that any two of them are connected with probability ${p}$, independently.

This model of random graphs has been extensively studied and this has long been known that ${G(N,\frac{c}{N})}$ undergoes a phase transition at ${c=1}$: for ${c<1}$, the largest component ${L^{(N,c)}}$ of ${\mathcal{G}(N,\frac{c}{N})}$ has a size of order ${\log(N)}$ while for ${c>1}$ the largest connected component ${L^{(N,c)}}$, also called the giant component, has size of order ${N}$. In another blog entry, another consequence of this phase transition has been used.

One can even show that for ${c>1}$ there exists a constant ${K_c}$ such that for any ${\epsilon > 0}$, $\displaystyle \lim_{N \rightarrow \infty} \; \; \mathop{\mathbb P}\Big( |\frac{\textrm{Card}(L^{(N,c)})}{N} - K_c| > \epsilon \Big) = 0.$

Simply put, the largest connected component of ${\mathcal{G}(N,\frac{c}{N})}$ has size approximatively equal to ${K_c \cdot N}$. Maybe surprisingly, this is quite simple to compute the value of ${K(c)}$ through fluid limit techniques.

2. Exploration of the graph

Let ${G}$ be a configuration of ${\mathcal{G}(N, \frac{c}{N})}$. This configuration is explored through a classical depth-first algorithm. To keep track of what is doing, a process ${Z_k: \{1, \ldots, N\} \rightarrow \mathbb{Z}}$ is introduced. This process captures essentially all the information needed to study the size of the connected component: it is then shown that ${Z}$ behaves essentially deterministically when looked under the appropriate scale.

1. initilisation: t=0: Color all the vertices in green, the current vertex is ${v=1}$ and define ${Z_1=0}$.
2. t=t+1: Color the current vertex in red (=visited): if a vertex is connected to ${v}$ and is green, color it in grey ( this vertex has still to be visited, but its ‘parent’ has already been visited): these are the children of ${v}$.
3. update ${Z}$ the following way, ${Z_t = Z_{t-1} + (\textrm{Nb of children of }v) - 1}$, and
• if ${v}$ has children, choose one of them and make it the new current vertex. Go to ${2}$.
• If ${v}$ has no children then the new current vertex is its parent. Go to ${2}$.
• if ${v}$ has no children and no parent, the new current vertex is one of the remaining (if any) green vertex. Go to ${2}$
• else: we are finished.

A moment of though reveals that ${Z_1=0}$ and ${Z}$ reaches ${-1}$ precisely when the connected component of the vertex ${1}$ has all been explored. Then ${Z}$ reaches ${-2}$ precisely when the second connected component has all been explored, etc … This is why if ${T_0=1}$ and ${T_k = \inf \{t: Z_t = -k\}}$ then the connected components have size exactly ${S_k = T_{k+1}-T_k}$ for ${k=0,1,2, \ldots}$. The largest component has size ${\max_k S_k}$.

3. Fluid Limit

It is quite easy to show why the rescaled process ${z^N(t) = \frac{ Z([Nt])}{N}}$ should behave like a differential equation (ie: fluid limit). While exploring the first component, ${Z_k}$ represents the number of gray vertices at time ${k}$ so that ${N-k-Z_k}$ represents the number of green vertices: this is why $\displaystyle \mathop{\mathbb E}\Big[Z_{k+1}-Z_k \;|Z_k \Big] = (N-k-Z_k) \frac{c}{N} = c\Big(1-\frac{k}{N} - \frac{Z_k}{N} \Big) - 1$

so that ${\mathop{\mathbb E}\Big[ \frac{z^N(t+\Delta)-z^N(t)}{\Delta} | z^N(t) \Big] \approx c\Big(1-t-z(t) \Big) - 1}$ where ${\Delta = \frac{1}{N}}$. Also, the variance statisfies ${\text{Var}\Big[ \frac{z^N(t+\Delta)-z^N(t)}{\sqrt{\Delta}} | z^N(t) \Big] \approx N^{-\frac{1}{2}} c\Big(1-t-z(t) \Big) \rightarrow 0}$ so that all the ingredients for a fluit limit are present: ${z^N}$ converges to the differential equation $\displaystyle \frac{d}{dt} z(t) = c\Big(1-t-z(t) \Big) - 1$

whose solution is ${z(t) = (1-t)-e^{-ct}}$. This is why ${K=K_c > 0}$ is implicitly defined as the solution of $\displaystyle 1-K_c = e^{-c K_c}.$

As expected, ${\lim_{c \rightarrow 1^+} K_c = 0}$ and ${\lim_{c \rightarrow +\infty} K_c = 1}$.