14 Interacting Particles on Finite Graphs (March 10, 1994)

14.1 Coupling

If XX and YY are random variables with Binomial (n,p1)(n,p_{1}) and (n,p2)(n,p_{2}) distributions respectively, and if p1p2p_{1}\leq p_{2}, then it is intuitively obvious that

P(Xx)P(Yx) for all x.P(X\geq x)\leq P(Y\geq x)\mbox{ for all }x. (14.1)

One could verify this from the exact formulas, but there is a more elegant non-computational proof. For 1in1\leq i\leq n define events (Ai,Bi,Ci)(A_{i},B_{i},C_{i}), independent as ii varies, with P(Ai)=p1,P(Bi)=p2-p1,P(Ci)=1-p2P(A_{i})=p_{1},P(B_{i})=p_{2}-p_{1},P(C_{i})=1-p_{2}. And define

X=i1Ai\displaystyle X^{\prime}=\sum_{i}1_{A_{i}} =\displaystyle= number of A’s which occur
Y=i1AiBi\displaystyle Y^{\prime}=\sum_{i}1_{A_{i}\cup B_{i}} =\displaystyle= number of A’s and B’s which occur.

Then XYX^{\prime}\leq Y^{\prime}, so (14.1) holds for XX^{\prime} and YY^{\prime}, but then because X=dXX^{\prime}\ \stackrel{d}{=}\ X and Y=dYY^{\prime}\ \stackrel{d}{=}\ Y we have proved that (14.1) holds for XX and YY. This is the prototype of a coupling argument, which (in its wide sense) means

to prove some distributional inequality relating two random processes X,YX,Y by constructing versions X,YX^{\prime},Y^{\prime} which satisfy some sample path inequality.

Our first “process” example is a somewhat analogous proof of part (a) of the following result, which abstracts slightly a result stated for random walk on distance-regular graphs (Chapter 7 Proposition yyy).

Proposition 14.1

Let (Xt)(X_{t}) be an irreducible continuous-time birth-and-death chain on states {0,1,,Δ}\{0,1,\ldots,\Delta\}.

(a) P0(Xt=i)πi is non-increasing in ii, for fixed tt \frac{P_{0}(X_{t}=i)}{\pi_{i}}\mbox{ is non-increasing in $i$, for fixed $t$ }

(b) P0(Xt=i)P0(Xt=0) is non-decreasing in tt, for fixed ii \frac{P_{0}(X_{t}=i)}{P_{0}(X_{t}=0)}\mbox{ is non-decreasing in $t$, for % fixed $i$ }

Proof. Fix i1i2i_{1}\leq i_{2}. Suppose we can construct processes YtY_{t} and ZtZ_{t}, distributed as the given chain started at i1i_{1} and i2i_{2} respectively, such that

YtZt for all t.Y_{t}\leq Z_{t}\mbox{ for all }t. (14.2)

Then

Pi1(Xt=0)=P(Yt=0)P(Zt=0)=Pi2(Xt=0).P_{i_{1}}(X_{t}=0)=P(Y_{t}=0)\geq P(Z_{t}=0)=P_{i_{2}}(X_{t}=0).

But by reversibility

Pi1(Xt=0)=π0πi1P0(Xt=i1)P_{i_{1}}(X_{t}=0)=\frac{\pi_{0}}{\pi_{i_{1}}}\ P_{0}(X_{t}=i_{1})

and similarly for i2i_{2}, establishing (a).

Existence of processes satisfying (14.2) is a consequence of the Doeblin coupling discussed below. The proof of part (b) involves a different technique and is deferred to section 14.1.3.

14.1.1 The coupling inequality

Consider a finite-state chain in discrete or continuous time. Fix states i,ji,j. Suppose we construct a joint process (Xt(i),Xt(j);t0)(X^{(i)}_{t},X^{(j)}_{t};t\geq 0) such that

(Xt(i),t0)(X^{(i)}_{t},t\geq 0) is distributed as the chain started at ii
(Xt(j),t0)(X^{(j)}_{t},t\geq 0) is distributed as the chain started at jj.\displaystyle\mbox{$(X^{(j)}_{t},t\geq 0)$ is distributed as the chain started% at $j$}. (14.3)

And suppose there is a random time TT\leq\infty such that

Xt(i)=Xt(j),  Tt<.X^{(i)}_{t}=X^{(j)}_{t},\ T\leq t<\infty. (14.4)

Call such a TT a coupling time. Then the coupling inequality is

||Pi(Xt)-Pj(Xt)||P(T>t), 0t<.||P_{i}(X_{t}\in\cdot)-P_{j}(X_{t}\in\cdot)||\leq P(T>t),\ 0\leq t<\infty. (14.5)

The inequality is clear once we observe P(Xt(i),Tt)=P(Xt(j),Tt)P(X_{t}^{(i)}\in\cdot,T\leq t)=P(X_{t}^{(j)}\in\cdot,T\leq t). The coupling inequality provides a method of bounding the variation distance d¯(t)\bar{d}(t) of Chapter 2 section yyy.

The most common strategy for constructing a coupling satisfying (14.3) is via Markov couplings, as follows. Suppose the underlying chain has state space II and (to take the continuous-time case) transition rate matrix 𝐐=(q(i,j)){\bf Q}=(q(i,j)). Consider a transition rate matrix 𝐐~\tilde{{\bf Q}} on the product space I×II\times I. Write the entries of 𝐐~\tilde{{\bf Q}} as q~(i,j;k,l)\tilde{q}(i,j;k,l) instead of the logical-but-fussy q~((i,j),(k,l))\tilde{q}((i,j),(k,l)). Suppose that, for each pair (i,j)(i,j) with jij\neq i,

q~(i,j;,) has marginals q(i,) and q(j,)\tilde{q}(i,j;\cdot,\cdot)\mbox{ has marginals }q(i,\cdot)\mbox{ and }q(j,\cdot) (14.6)

in other words lq~(i,j;k,l)=q(i,k)\sum_{l}\tilde{q}(i,j;k,l)=q(i,k) and kq~(i,j;k,l)=q(j,l)\sum_{k}\tilde{q}(i,j;k,l)=q(j,l). And suppose that

q~(i,i;k,k)\displaystyle\tilde{q}(i,i;k,k) =\displaystyle= q(i,k) for all k\displaystyle q(i,k)\mbox{ for all }k
q~(i,i;k,l)\displaystyle\tilde{q}(i,i;k,l) =\displaystyle= 0 for lk.\displaystyle 0\mbox{ for }l\neq k.

Take (Xt(i),Xt(j))(X^{(i)}_{t},X^{(j)}_{t}) to be the chain on I×II\times I with transition rate matrix 𝐐~\tilde{{\bf Q}} and initial position (i,j)(i,j), Then (14.3) must hold, and Tmin{t:Xt(i)=Xt(j)}T\equiv\min\{t:X^{(i)}_{t}=X^{(j)}_{t}\} is a coupling time. This construction gives a Markov coupling, and all the examples where we use the coupling inequality will be of this form. In practice it is much more understandable to define the joint process in words

xxx red and black particles.

A particular choice of 𝐐~\tilde{{\bf Q}} is

q~(i,j;k,l)=q(i,k)q(j,l),  ji\tilde{q}(i,j;k,l)=q(i,k)q(j,l),\ j\neq i (14.7)

in which case the joint process is called to Doeblin coupling. In words, the Doeblin coupling consists of starting one particle at ii and the other particle at jj, and letting the two particles move independently until they meet, at time Mi,jM_{i,j} say, and thereafter letting them stick together. In the particular case of a birth-and-death process, the particles cannot cross without meeting (in continuous time), and so if i<ji<j then Xt(i)Xt(j)X^{(i)}_{t}\leq X^{(j)}_{t} for all tt, the property we used at (14.2).

14.1.2 Examples using the coupling inequality

Use of the coupling inequality has nothing to do with reversibility. In fact it finds more use in the irreversible setting, where fewer alternative methods are available for quantifying convergence to stationarity. In the reversible setting, coupling provides a quick way to get bounds which usually (but not always) can be improved by other methods. Here are two examples we have seen before.

Example 14.2

Random walk on the dd-cube (Chapter 5 Example yyy).

For 𝐢=(i1,,id){\bf i}=(i_{1},\ldots,i_{d}) and 𝐣=(j1,,jd){\bf j}=(j_{1},\ldots,j_{d}) in I={0,1}dI=\{0,1\}^{d}, let D(𝐢,𝐣)D({\bf i},{\bf j}) be the set of coordinates uu where 𝐢{\bf i} and 𝐣{\bf j} differ. Write 𝐢u{\bf i}^{u} for the state obtained by changing the ii’th coordinate of 𝐢{\bf i}. Recall that in continuous time the components move independently as 22-state chains with transition rates 1/d1/d. In words, the coupling is “run unmatched coordinates independently until they match, and then run them together”. Formally, the non-zero transitions of the joint process are

q~(𝐢,𝐣;𝐢u,𝐣u)\displaystyle\tilde{q}({\bf i},{\bf j};{\bf i}^{u},{\bf j}^{u}) =\displaystyle= 1/d if iu=ju\displaystyle 1/d\mbox{ if }i_{u}=j_{u}
q~(𝐢,𝐣;𝐢u,𝐣)\displaystyle\tilde{q}({\bf i},{\bf j};{\bf i}^{u},{\bf j}) =\displaystyle= 1/d if iuju\displaystyle 1/d\mbox{ if }i_{u}\neq j_{u}
q~(𝐢,𝐣;𝐢,𝐣u)\displaystyle\tilde{q}({\bf i},{\bf j};{\bf i},{\bf j}^{u}) =\displaystyle= 1/d if iuju.\displaystyle 1/d\mbox{ if }i_{u}\neq j_{u}.

For each coordinate which is initially unmatched, it takes exponential (rate 2/d2/d) time until it is matched, and so the coupling time TT satisfies

T=dmax(ξ1,,ξd0)T\ \stackrel{d}{=}\ \max(\xi_{1},\ldots,\xi_{d_{0}})

where the (ξu)(\xi_{u}) are independent exponential (rate 2/d2/d) and d0=d(𝐢,𝐣)d_{0}=d({\bf i},{\bf j}) is the initial number of unmatched coordinates. So

P(Tt)=(1-exp(-2t/d))d0P(T\leq t)=(1-\exp(-2t/d))^{d_{0}}

and the coupling inequality bounds variation distance as

d¯(t)(1-exp(-2t/d))d.\bar{d}(t)\leq(1-\exp(-2t/d))^{d}.

This leads to an upper bound on the variation threshold time

τ1(12+o(1))dlogd as d.\tau_{1}\leq(\frac{1}{2}+o(1))d\log d\mbox{ as }d\rightarrow\infty.

In this example we saw in Chapter 5 that in fact

τ114dlogd as d\tau_{1}\sim\frac{1}{4}d\log d\mbox{ as }d\rightarrow\infty

so the coupling bound is off by a factor of 22.

Example 14.3

Random walk on a dense regular graph (Chapter 5 Example yyy).

Consider a rr-regular nn-vertex graph. Write 𝒩(v)\mbox{${\cal N}$}(v) for the set of neighbors of vv. For any pair v,wv,w we can define a 1-11-1 map θv,w:𝒩(v)𝒩(w)\theta_{v,w}:\mbox{${\cal N}$}(v)\rightarrow\mbox{${\cal N}$}(w) such that θv,w(x)=x\theta_{v,w}(x)=x for x𝒩(v)𝒩(w)x\in\mbox{${\cal N}$}(v)\cap\mbox{${\cal N}$}(w). We can now define a “greedy coupling” by

q~(v,w;x,θv,w(x)=1/r,x𝒩(v).\tilde{q}(v,w;x,\theta_{v,w}(x)=1/r,\ \ x\in\mbox{${\cal N}$}(v).

In general one cannot get useful bounds on the coupling time TT. But consider the dense case, where r>n/2r>n/2. As observed in Chapter 5 Example yyy, here |𝒩(v)𝒩(w)|2r-n|\mbox{${\cal N}$}(v)\cap\mbox{${\cal N}$}(w)|\geq 2r-n and so the coupled processes (Xt,Yt)(X_{t},Y_{t}) have the property that for wvw\neq v

P(Xt+dt=Yt+dt|Xt=v,Yt=w)=|𝒩(v)𝒩(w)|rdt2r-nrdtP(X_{t+dt}=Y_{t+dt}|X_{t}=v,Y_{t}=w)=\frac{|\mbox{${\cal N}$}(v)\cap\mbox{${% \cal N}$}(w)|}{r}dt\geq\frac{2r-n}{r}dt

implying that TT satisfies

P(T>t)exp(-(2r-n)t/r).P(T>t)\leq\exp(-(2r-n)t/r).

So the coupling inequality implies d¯(t)exp(-(2r-n)t/r)\bar{d}(t)\leq\exp(-(2r-n)t/r), and in particular the variation threshold satisfies

τ1r2r-n.\tau_{1}\leq\frac{r}{2r-n}.

14.1.3 Comparisons via couplings

We now give two examples of coupling in the wide sense, to compare different processes. The first is a technical result (inequality (14.8) below) which we needed in Chapter 6 yyy. The second is the proof of Proposition 14.1(b).

Example 14.4

Exit times for constrained random walk.

Let (Xt)(X_{t}) be discrete-time random walk on a graph GG, let AA be a subset of the vertices of GG and let (Yt)(Y_{t}) be random walk on the subgraph induced by AA. Given BAB\subset A, let SS be the first hitting time of (Yt)(Y_{t}) on BB, and let TT be the first hitting time of (Xt)(X_{t}) on BAcB\cup A^{c}. Then

EiTEiS,  iA.E_{i}T\leq E_{i}S,\ i\in A. (14.8)

This is “obvious”, and the reason it’s obvious is by coupling. We can construct coupled processes (X,Y)(X^{\prime},Y^{\prime}) with the property that, if both particles are at the same position aa in AA, and if XX jumps to another state bb in AA, then YY jumps to the same state bb. This property immediately implies that, for the coupled processes started at the same state in AA, we have TST^{\prime}\leq S^{\prime} and hence (14.8).

In words, here is the coupling (X,Y)(X^{\prime},Y^{\prime}). When the particles are at different positions they jump independently. When they are at the same position, first let XX^{\prime} jump; if XX^{\prime} jumps to a vertex in AA let YY^{\prime} jump to the same vertex, and otherwise let YY^{\prime} jump to a uniform random neighbor in AA. Formally, the coupled process moves according to the transition matrix 𝐏~\tilde{{\bf P}} on G×AG\times A defined by

p~(x,a;y,b)=pG(x,y)pA(a,b) if xAx\not\in A or xax\neq a\tilde{p}(x,a;y,b)=p_{G}(x,y)\ p_{A}(a,b)\mbox{ if $x\not\in A$ or $x\neq a$}
p~(a,a;b,b)=pG(a,b),  bA\tilde{p}(a,a;b,b)=p_{G}(a,b),\ b\in A
p~(a,a;y,b)=pG(a,y)pA(a,b),  bA,yAc\tilde{p}(a,a;y,b)=p_{G}(a,y)p_{A}(a,b),\ b\in A,y\in A^{c}

where pAp_{A} and pGp_{G} refer to transition probabilities for the original random walks on AA and GG.

Proof of Proposition 14.1(b). Fix i1i\geq 1. By reversibility it is sufficient to prove

P0(Xt=i)P0(Xt=0) is non-decreasing in tt .\frac{P_{0}(X_{t}=i)}{P_{0}(X_{t}=0)}\mbox{ is non-decreasing in $t$ }.

Consider the Doeblin coupling (Xt(0),Xt(i))(X^{(0)}_{t},X^{(i)}_{t}) of the processes started at 00 and at ii, with coupling time TT. Since Xt(0)Xt(i)X^{(0)}_{t}\leq X^{(i)}_{t} we have

P(Xt(i)=0)=P(Xt(0)=0,Tt)P(X^{(i)}_{t}=0)=P(X^{(0)}_{t}=0,T\leq t)

and so we have to prove

P(Tt|Xt(0)=0) is non-decreasing in tt .P(T\leq t|X^{(0)}_{t}=0)\mbox{ is non-decreasing in $t$ }.

It suffices to show that, for t1>tt_{1}>t,

P(Tt|Xt1(0)=0)P(Tt|Xt(0)=0)P(T\leq t|X^{(0)}_{t_{1}}=0)\geq P(T\leq t|X^{(0)}_{t}=0)

and thus, by considering the conditional distribution of Xt(0)X^{(0)}_{t} given Xt1(0)=0X^{(0)}_{t_{1}}=0, it suffices to show that

P(Tt|Xt(0)=j)P(Tt|Xt(0)=0)P(T\leq t|X^{(0)}_{t}=j)\geq P(T\leq t|X^{(0)}_{t}=0) (14.9)

for j0j\geq 0. So fix jj and tt. Write (Xs(0,j),0st)(X^{(0,j)}_{s},0\leq s\leq t) for the process conditioned on X0=0,Xt=jX_{0}=0,X_{t}=j. By considering time running backwards from tt to 00, the processes X(0,0)X^{(0,0)} and X(0,j)X^{(0,j)} are the same non-homogeneous Markov chain started at the different states 00 and jj, and we can use the Doeblin coupling in this non-homogeneous setting to construct versions of these processes with

Xs(0,0)Xs(0,j), 0st.X^{(0,0)}_{s}\leq X^{(0,j)}_{s},\ 0\leq s\leq t.

Now introduce an independent copy of the original process, started at time 00 in state ii. If this process meets X(0,0)X^{(0,0)} before time tt then it must also meet X(0,j)X^{(0,j)} before time tt, establishing (14.9).