# 14.3 Coalescing random walks and the voter model

Sections 14.3 and 14.4 treat some models whose behavior relates “by duality” to random-walk-type processes. It is possible (see Notes) to fit all our examples into an abstract duality framework, but for the sake of concreteness I haven’t done so. Note that for simplicity we work in the setting of regular graphs, though the structural results go over to general graphs and indeed to weighted graphs.

Fix a $r$-regular $n$-vertex graph $G$. In the voter model we envisage a person at each vertex. Initially each person has a different opinion (person $i$ has opinion $i$, say). As time passes, opinions change according to the following rule. For each person $i$ and each time interval $[t,t+dt]$, with chance $dt$ the person chooses uniformly at random a neighbor ($j$, say) and changes (if necessary) their opinion to the current opinion of person $j$. Note that the total number of existing opinions can only decrease with time, and at some random time $C_{{\bf vm}}$ there will be only one “consensus” opinion.

In the coalescing random walk process, at time $0$ there is one particle at each vertex. These particles perform independent continuous-time random walks on the graph, but when particles meet they coalesce into clusters and the cluster thereafter sticks together and moves as a single random walk. So at time $t$ there are clusters, composed of one or more particles, at distinct vertices, and during $[t,t+dt]$ each cluster has chance $dt$ to move to a random neighbor and (if that neighbor is occupied by another cluster) to coalesce with that other cluster. Note that the total number of clusters can only decrease with time, and at some random time $C_{{\bf crw}}$ the particles will have all coalesced into a single cluster.

Remarkably, the two random variables $C_{{\bf vm}}$ and $C_{{\bf crw}}$ associated with the two models turn out to have the same distribution, depending only on the graph $G$. The explanation is that the two processes can be obtained by looking at the same picture in two different ways. Here’s the picture. For each edge $e$ and each direction on $e$, create a Poisson process of rate $1/r$. In the figure, $G$ is the $8$-cycle, “time” is horizontal and an event of the Poisson process for edge $(i,j)$ at time $t$ is indicated by a vertical arrow $i\rightarrow j$ at time $t$.

In the voter model, we interpret time as increasing left-to-right from $0$ to $t_{0}$, and we interpret an arrow $j\rightarrow i$ at time $t$ as meaning that person $j$ adopts $i$’s opinion a time $t$. In the coalescing random walk model, we interpret time as increasing right-to-left from $0$ to $t_{0}$, and we interpret an arrow $j\rightarrow i$ at time $t$ as meaning that the cluster (if any) at state $j$ at time $t$ jumps to state $i$, and coalesces with the cluster at $i$ (if any).

So for fixed $t_{0}$, we can regard both processes as constructed from the same Poisson process of “arrows”. For any vertices $i,j,k$ the event (for the voter model)

The opinions of persons $i$ and $j$ at time $t_{0}$ are both the opinion initially held by $k$

is exactly the same as the event (for the coalescing random walk process)

The particles starting at $i$ and at $j$ have coalesced before time $t_{0}$ and their cluster is at vertex $k$ at time $t_{0}$.

The horizontal lines in the figure indicate part of the trajectories. In terms of the coalescing random walks, the particles starting at $5$ and $7$ coalesce, and the cluster is at $4$ at time $t_{0}$. In terms of the voter model, the opinion initially held by person $4$ is held by persons $5$ and $7$ at time $t_{0}$. The reader may (provided this is not a library book) draw in the remaining trajectories, and will find that exactly $3$ of the initial opinions survive, i.e. that the random walks coalesce into $3$ clusters.

In particular, the event (for the voter model)

By time $t_{0}$ everyone’s opinion is the opinion initially held by person $k$

is exactly the same as the event (for the coalescing random walk process)

All particles have coalesced by time $t_{0}$, and the cluster is at $k$ at time $t_{0}$.

So $P(C_{{\bf vm}}\leq t_{0})=P(C_{{\bf crw}}\leq t_{0})$, and these two times (which we shall now call just $C$) do indeed have the same distribution.

We now discuss bounds on $EC$. It is interesting that the two models give us quite different ways to prove bounds. Bounding $EC$ here is somewhat analogous to the problem of bounding mean cover time, discussed in Chapter 6.

## 14.3.1 A bound using the voter model

Recall from Chapter 4 yyy the definition of the Cheeger time constant $\tau_{c}$. In the present setting of a $r$-regular graph, the definition implies that for any subset $A$ of vertices

 $A$ (14.11)
###### Proposition 14.9

(a) If $G$ is $s$-edge-connected then $EC\leq\frac{rn^{2}}{4s}$.

(b) $EC\leq 2\log 2\ \tau_{c}n$.

Proof. The proof uses two ideas. The first is a straightforward comparison lemma.

###### Lemma 14.10

Let $(X_{t})$ be a continuous-time chain on states $I$. Let $f:I\rightarrow\{0,1,\ldots,n\}$ be such that $f(X_{t})$ never jumps by more than $1$, and such that there exist strictly positive constants $\gamma,a(1),\ldots,a(n-1)$ such that, for each $1\leq i\leq n-1$ and each state $x$ with $f(x)=i$,

 $\frac{P(f(X_{t+dt})=i+1|X_{t}=x)}{dt}=\frac{P(f(X_{t+dt})=i-1|X_{t}=x)}{dt}% \geq\gamma a(i).$

Then

 $E_{x}T_{\{f^{-1}(0),f^{-1}(n)\}}\leq\gamma^{-1}E^{*}_{f(x)}T^{*}_{\{0,n\}}$

where $E^{*}T^{*}$ refers to mean hitting time for the chain $X^{*}$ on states $\{0,1,\ldots,n\}$ with transition rates

 $q_{i,i+1}=q_{i,i-1}=a(i).$

The second idea is that our voter model can be used to define a less-informative “two-party” model. Fix an initial set $B$ of vertices, and group the opinions of the individuals in $B$ into one political party (“Blues”) and group the remaining opinions into a second party (“Reds”). Let $N^{B}_{t}$ be the number of Blues at time $t$ and let $C^{B}\leq C$ be the first time at which everyone belongs to the same party. Then

 $\displaystyle P(N^{B}_{t+dt}=N^{B}_{t}+1|\mbox{ configuration at time }t)$ $\displaystyle=P(N^{B}_{t+dt}=N^{B}_{t}-1|\mbox{ configuration at time }t)$ $t$ (14.12)

Cases (a) and (b) now use Lemma 14.10 with different comparison chains. For (a), while both parties coexist, the number of edges being counted in (14.12) is at least $s$. To see this, fix two vertices $v,x$ of different parties, and consider (c.f. Chapter 6 yyy) a collection of $s$ edge-disjoint paths from $v$ to $x$. Each path must contain at least one edge linking Blue to Red. Thus the quantity (14.12) is at least $\frac{s}{r}\ dt$. If that quantity were $\frac{1}{2}\ dt$ then $N^{B}_{t}$ would be continuous time random walk on $\{0,\ldots,n\}$ and the quantity $EC^{B}$ would be the mean time, starting at $|B|$, for simple random walk to hit $0$ or $n$, which by Chapter 5 yyy we know equals $|B|(n-|B|)$. So using Lemma 14.10

 $EC^{B}\leq\frac{r}{2s}|B|(n-|B|)\leq\frac{rn^{2}}{8s}.$ (14.13)

For (b), use (14.11) to see that the quantity (14.12) must be at least $\frac{N^{B}_{t}(n-N^{B}_{t})}{n\tau_{c}}\ dt$. Consider for comparison the chain on $\{0,\ldots,n\}$ with transition rates $q_{i,i+1}=q_{i,i-1}=i(n-i)/n$. For this chain

 $\displaystyle E^{*}_{i}T^{*}_{\{0,n\}}$ $\displaystyle=$ $j$ $\displaystyle=$ $\displaystyle\sum_{j=1}^{n-1}m_{i}(j)\frac{\frac{1}{2}}{j(n-j)/n}$

where $m_{i}(j)$ is the mean occupation time for simple symmetric random walk and the second term is the speed-up factor for the comparison chain under consideration. Using the formula for $m_{i}(j)$ from Chapter 5 yyy,

 $E^{*}_{i}T^{*}_{\{0,n\}}=i\sum_{j=i}^{n-1}\frac{1}{j}+(n-i)\sum_{j=1}^{i-1}% \frac{1}{n-j}\leq n\log 2.$

So using Lemma 14.10

 $EC^{B}\leq\tau_{c}n\log 2.$ (14.14)

Finally, imagine choosing ${\bf B}$ at random by letting each individual initially be Blue or Red with probability $1/2$ each, independently for different vertices. Then by considering some two individuals with different opinions at time $t$,

 $P(C^{\bf B}>t)\geq\frac{1}{2}P(C>t).$

Integrating over $t$ gives $EC\leq 2EC^{\bf B}$. But $EC^{\bf B}\leq\max_{B}EC^{B}$, so the Proposition follows from (14.13) and (14.14).

## 14.3.2 A bound using the coalescing random walk model

The following result bounds the mean coalescing time in terms of mean hitting times of a single random walk.

###### Proposition 14.11

$EC\leq e(\log n+2)\max_{i,j}E_{i}T_{j}$

Proof. We can construct the coalescing random walk process in two steps. Order the vertices arbitrarily as $i_{1},\ldots,i_{n}$. First let the $n$ particles perform independent random walks for ever, with the particles starting at $i,j$ first meeting at time $M_{i,j}$, say. Then when two particles meet, let them cluster and follow the future path of the lower-labeled particle. Similarly, when two clusters meet, let them cluster and follow the future path of the lowest-labeled particle in the combined cluster. Using this construction, we see

 $C_{{\bf crw}}\leq\max_{j}M_{i_{1},j}.$ (14.15)

Now let $m^{*}\equiv\max_{i,j}EM_{i,j}$. Using subexponentiality as in Chapter 2 section yyy,

 $P(M_{i,j}>t)\leq\exp(-\lfloor{t\over em^{*}}\rfloor).$ (14.16)

and so

 $\displaystyle EC$ $\displaystyle=$ $\displaystyle\int_{0}^{\infty}P(C>t)dt$ $\displaystyle\leq$ $\displaystyle\int_{0}^{\infty}\min(1,\sum_{j}P(M_{i_{1},j}>t))\ dt\mbox{ by }(% \ref{CM})$ $\displaystyle\leq$ $\displaystyle\int_{0}^{\infty}\min(1,ne\exp(-\frac{t}{em^{*}}))\ dt\mbox{ by }% (\ref{Mij})$ $\displaystyle=$ $\displaystyle em^{*}(2+\log n)$

where the final equality is the calculus fact

 $\int_{0}^{\infty}\min(1,Ae^{-at})\ dt=a^{-1}(1+\log A),\ \ \ A\geq 1.$

The result now follows from Proposition 14.5.

## 14.3.3 Conjectures and examples

The complete graph. On the complete graph, the number $K_{t}$ of clusters at time $t$ in the coalescing random walk model is itself the continuous-time chain with transition rates

 $q_{k,k-1}=k(k-1)/(n-1);\ \ n\geq k\geq 2.$

Since $C_{{\bf crw}}$ is the time taken for $K_{t}$ to reach state $1$,

 $EC=\sum_{k=2}^{m}\frac{n-1}{k(k-1)}=\frac{(n-1)^{2}}{n}\sim n.$

Recall from Chapter 7 yyy that in a vertex-transitive graph with $\tau_{2}/\tau_{0}$ small, the first hitting time to a typical vertex has approximately exponential distribution with mean $\tau_{0}$. Similarly, the meeting time $M_{i,j}$ for typical $i,j$ has approximately exponential distribution with mean $\tau_{0}/2$. It seems intuitively clear that, for fixed small $k$, when the number of clusters first reaches $k$ these clusters should be approximately uniformly distributed, so that the mean further time until one of the $k(k-1)/2$ pairs coalesce should be about $\frac{\tau_{0}}{k(k-1)}$. Repeating the analysis of the complete graph suggests

###### Open Problem 14.12

Prove that for a sequence of vertex-transitive graphs with $\tau_{2}/\tau_{0}\rightarrow 0$, we have $EC\sim\tau_{0}$.

In the general setting, there is good reason to believe that the log term in Proposition 14.11 can be removed.

###### Open Problem 14.13

Prove there exists an absolute constant $K$ such that on any graph

 $EC\leq K\max_{v,w}E_{v}T_{w}.$

The assertion of Open Problem 14.12 in the case of the torus $Z^{d}_{m}$ for $d\geq 2$ was proved by Cox [104]. A detailed outline is given in [132] Chapter 10b, so we will not repeat it here, but see the remark in section 14.3.5 below.

xxx discuss $d=1$?

## 14.3.4 Voter model with new opinions

For a simple variation of the voter model, fix a parameter $0<\lambda<\infty$ and suppose that each individual independently decides at rate $\lambda$ (i.e. with chance $\lambda dt$ in each time interval $[t,t+dt]$) to adopt a new opinion, not previously held by anyone. For this process we may take as state space the set of partitions ${\bf A}=\{A_{1},A_{2},\ldots\}$ of the vertex-set of the underlying graph $G$, where two individuals have the same opinion iff they are in the same component $A$ of ${\bf A}$. The duality relationship holds with the following modification. In the dual process of coalescing random walks, each cluster “dies” at rate $\lambda$. Thus in the dual process run forever, each “death” of a cluster involves particles started at some set $A$ of vertices, and this partition ${\bf A}=\{A_{i}\}$ of vertices into components is (by duality) distributed as the stationary distribution of the voter model with new opinions. This is the unique stationary distribution, even though (e.g. on the $n$-cycle) the Markov chain may not be irreducible because of the existence of transient states.

The time to approach stationarity in this model is controlled by the time $\tilde{C}$ for the dual process to die out completely. Clearly $E\tilde{C}\leq EC+1/\lambda$, where $C$ is the coalescing time discussed in previous sections, and we do not have anything new to say beyond what is implied by previous results. Instead, we study properties of the stationary distribution ${\bf A}=\{A_{i}\}$. A natural parameter is the chance, $\gamma$ say, that two random individuals have the same opinion, i.e.

 $\gamma\equiv E\sum_{i}\frac{|A_{i}|^{2}}{n^{2}}.$ (14.17)
###### Lemma 14.14
 $\gamma=\frac{2E\mbox{{\cal E}}}{\lambda rn^{2}}+\frac{1}{n},$

where ${\cal E}$ is the number of edges with endpoints in different components, under the stationary distribution.

Proof. Run the stationary process, and let ${\bf A}(t)$ and $\mbox{{\cal E}}(t)$ be the partition and the number of edges linking distinct components, at time $t$, and let $S(t)=\sum_{i}|A_{i}(t)|^{2}$. Then

 $\frac{E(S^{2}(t+dt)-S^{2}(t)|\mbox{ configuration at time }t)}{dt}$
 $=\frac{4}{r}\mbox{{\cal E}}(t)+2\lambda\sum_{i}|A_{i}(t)|(1-|A_{i}(t)|).$ (14.18)

The first term arises from the “voter” dynamics. If an opinion change involves an edge linking components of sizes $a$ and $b$, then the change in $S^{2}$ has expectation

 $\frac{(a+1)^{2}+(a-1)^{2}+(b+1)^{2}-(b-1)^{2}}{2}-(a^{2}+b^{2})=2$

and for each of the $\mbox{{\cal E}}(t)$ edges linking distinct components, opinion changes occur at rate $2/r$. The second term arises from new opinions. A new opinion occurs in a component of size $a$ at rate $\lambda a$, and the resulting change in $S^{2}$ is

 $(a-1)^{2}+1^{2}-a^{2}=2(1-a).$

Stationarity implies that the expectation of (14.18) equals zero, and so

 $\frac{4}{r}E\mbox{{\cal E}}=2\lambda\sum_{i}E|A_{i}|(|A_{i}|-1)=2\lambda(n^{% 2}\gamma-n)$

and the lemma follows.

###### Corollary 14.15

$\frac{1+\lambda\tau_{c}}{1+\lambda\tau_{c}n}\leq\gamma\leq\frac{\lambda+1}{% \lambda n}$.

Proof. Clearly $E\mbox{{\cal E}}$ is at most the total number of edges, $nr/2$, so the upper bound follows from the lemma. For the lower bound, (14.11) implies

 $\xi\geq\frac{r\sum_{i}|A_{i}|(n-|A_{i}|)}{2n\tau_{c}}$

and hence

 $E\mbox{{\cal E}}\geq\frac{r}{2n\tau_{c}}(n^{2}-n^{2}\gamma)$

and the bound follows from the lemma after brief manipulation.

We now consider bounds on $\gamma$ obtainable by working with the dual process. Consider the meeting time $M$ of two independent random walks started with the stationary distribution. Then by duality (xxx explain)

 $\gamma=P(M<\xi_{(2\lambda)})$

where $\xi_{(2\lambda)}$ denotes a random variable with exponential $(2\lambda)$ distribution independent of the random walks. Now $M$ is the hitting time of the stationary “product chain” (i.e. two independent continuous-time random walks) on the diagonal $A=\{(v,v)\}$, so by Chapter 3 yyy $M$ has completely monotone distribution, and we shall use properties of complete monotonicity to get

###### Corollary 14.16
 $\frac{1}{1+2\lambda EM}\leq\gamma\leq\frac{1}{1+2\lambda EM}+\frac{\tau_{2}}{% EM}.$

Proof. We can write $M\ \stackrel{d}{=}\ R\xi_{(1)}$, where $\xi_{(1)}$ has exponential$(1)$ distribution and $R$ is independent of $\xi_{(1)}$. Then

 $\displaystyle\gamma$ $\displaystyle=$ $\displaystyle P(R\xi_{(1)}<\xi_{(2\lambda)}$ $\displaystyle=$ $\displaystyle E\ P(R\xi_{(1)}<\xi_{(2\lambda)}|R)$ $\displaystyle=$ $\displaystyle E\frac{1}{1+2\lambda R}$ $\displaystyle\geq$ $\displaystyle\frac{1}{1+2\lambda ER}\mbox{ by Jensen's inequality}$ $\displaystyle=$ $\displaystyle\frac{1}{1+2\lambda EM}.$

For the upper bound, apply Chapter 3 yyy to the product chain to obtain

 $P(M>t)\geq\exp(-t/EM)-\tau_{2}/EM$

(recall that $\tau_{2}$ is the same for the product chain as for the underlying random walk). So

 $\displaystyle 1-\gamma$ $\displaystyle=$ $\displaystyle P(M\geq\xi_{(2\lambda)})$ $\displaystyle=$ $\displaystyle\int_{0}^{\infty}P(M\geq t)\ 2\lambda e^{-2\lambda t}dt$ $\displaystyle\geq$ $\displaystyle\frac{2\lambda EM}{1=2\lambda EM}-\frac{\tau_{2}}{EM}$

and the upper bound follows after rearrangement.

Note that on a vertex-transitive graph Proposition 14.5 implies $EM=\tau_{0}/2$. So on a sequence of vertex-transitive graphs with $\tau_{2}/\tau_{0}\rightarrow 0$ and with $\lambda\tau_{0}\rightarrow\theta$, say, Corollary 14.16 implies $\gamma\rightarrow\frac{1}{1+\theta}$. But in this setting we can say much more, as the next section will show.

## 14.3.5 Large component sizes in the voter model with new opinions

xxx discuss coalescent, GEM and population genetics.

xxx genetics already implicit in xxx

Fix $0<\theta<\infty$. take independent random variables $(\xi_{i})$ with distribution

 $P(\xi>x)=(1-x)^{\theta},\ \ 0

and define

 $(X^{(\theta)}_{1},X^{(\theta)}_{2},X^{(\theta)}_{3},\ldots)=(\xi_{1},(1-\xi_{1% })\xi_{2},(1-\xi_{1})(1-\xi_{2})\xi_{3},\ldots)$

so that $\sum_{i}X^{(\theta)}_{i}=1$.

###### Proposition 14.17

Consider a sequence of vertex-transitive graphs for which $\tau_{2}/\tau_{0}\rightarrow 0$. Consider the stationary distribution ${\bf A}$ of the voter model with new opinions, presented in size-biased order. If $\lambda\tau_{0}\rightarrow\theta$ then

 $\left(\frac{|A_{1}|}{n},\ldots,\frac{|A_{k}|}{n}\right)\ \stackrel{d}{% \rightarrow}\ (X^{(\theta)}_{1},\ldots,X^{(\theta)}_{k})\mbox{ for all fixed }k.$

xxx proof

Remark. The same argument goes halfway to proving Open Problem 14.12, by showing

###### Corollary 14.18

Consider a sequence of vertex-transitive graphs for which $\tau_{2}/\tau_{0}\rightarrow 0$. Let $C^{(k)}$ be the coalescing time for $k$ walks started at independent uniform positions. Then, for fixed $k$, $EC^{(k)}\sim\tau_{0}(1-k^{-1})$.

xxx argument similar (?) to part of the proof in Cox [104] for the torus.

## 14.3.6 Number of components in the voter model with new opinions

xxx $\tau_{c}$ result