If and are random variables with Binomial and distributions respectively, and if , then it is intuitively obvious that
(14.1) |
One could verify this from the exact formulas, but there is a more elegant non-computational proof. For define events , independent as varies, with . And define
number of A’s which occur | ||||
number of A’s and B’s which occur. |
Then , so (14.1) holds for and , but then because and we have proved that (14.1) holds for and . This is the prototype of a coupling argument, which (in its wide sense) means
to prove some distributional inequality relating two random processes by constructing versions which satisfy some sample path inequality.
Our first “process” example is a somewhat analogous proof of part (a) of the following result, which abstracts slightly a result stated for random walk on distance-regular graphs (Chapter 7 Proposition yyy).
Let be an irreducible continuous-time birth-and-death chain on states .
(a)
(b)
Proof. Fix . Suppose we can construct processes and , distributed as the given chain started at and respectively, such that
(14.2) |
Then
But by reversibility
and similarly for , establishing (a).
Existence of processes satisfying (14.2) is a consequence of the Doeblin coupling discussed below. The proof of part (b) involves a different technique and is deferred to section 14.1.3.
Consider a finite-state chain in discrete or continuous time. Fix states . Suppose we construct a joint process such that
is distributed as the chain started at | |||
(14.3) |
And suppose there is a random time such that
(14.4) |
Call such a a coupling time. Then the coupling inequality is
(14.5) |
The inequality is clear once we observe . The coupling inequality provides a method of bounding the variation distance of Chapter 2 section yyy.
The most common strategy for constructing a coupling satisfying (14.3) is via Markov couplings, as follows. Suppose the underlying chain has state space and (to take the continuous-time case) transition rate matrix . Consider a transition rate matrix on the product space . Write the entries of as instead of the logical-but-fussy . Suppose that, for each pair with ,
(14.6) |
in other words and . And suppose that
Take to be the chain on with transition rate matrix and initial position , Then (14.3) must hold, and is a coupling time. This construction gives a Markov coupling, and all the examples where we use the coupling inequality will be of this form. In practice it is much more understandable to define the joint process in words
xxx red and black particles.
A particular choice of is
(14.7) |
in which case the joint process is called to Doeblin coupling. In words, the Doeblin coupling consists of starting one particle at and the other particle at , and letting the two particles move independently until they meet, at time say, and thereafter letting them stick together. In the particular case of a birth-and-death process, the particles cannot cross without meeting (in continuous time), and so if then for all , the property we used at (14.2).
Use of the coupling inequality has nothing to do with reversibility. In fact it finds more use in the irreversible setting, where fewer alternative methods are available for quantifying convergence to stationarity. In the reversible setting, coupling provides a quick way to get bounds which usually (but not always) can be improved by other methods. Here are two examples we have seen before.
Random walk on the -cube (Chapter 5 Example yyy).
For and in , let be the set of coordinates where and differ. Write for the state obtained by changing the ’th coordinate of . Recall that in continuous time the components move independently as -state chains with transition rates . In words, the coupling is “run unmatched coordinates independently until they match, and then run them together”. Formally, the non-zero transitions of the joint process are
For each coordinate which is initially unmatched, it takes exponential (rate ) time until it is matched, and so the coupling time satisfies
where the are independent exponential (rate ) and is the initial number of unmatched coordinates. So
and the coupling inequality bounds variation distance as
This leads to an upper bound on the variation threshold time
In this example we saw in Chapter 5 that in fact
so the coupling bound is off by a factor of .
Random walk on a dense regular graph (Chapter 5 Example yyy).
Consider a -regular -vertex graph. Write for the set of neighbors of . For any pair we can define a map such that for . We can now define a “greedy coupling” by
In general one cannot get useful bounds on the coupling time . But consider the dense case, where . As observed in Chapter 5 Example yyy, here and so the coupled processes have the property that for
implying that satisfies
So the coupling inequality implies , and in particular the variation threshold satisfies
We now give two examples of coupling in the wide sense, to compare different processes. The first is a technical result (inequality (14.8) below) which we needed in Chapter 6 yyy. The second is the proof of Proposition 14.1(b).
Exit times for constrained random walk.
Let be discrete-time random walk on a graph , let be a subset of the vertices of and let be random walk on the subgraph induced by . Given , let be the first hitting time of on , and let be the first hitting time of on . Then
(14.8) |
This is “obvious”, and the reason it’s obvious is by coupling. We can construct coupled processes with the property that, if both particles are at the same position in , and if jumps to another state in , then jumps to the same state . This property immediately implies that, for the coupled processes started at the same state in , we have and hence (14.8).
In words, here is the coupling . When the particles are at different positions they jump independently. When they are at the same position, first let jump; if jumps to a vertex in let jump to the same vertex, and otherwise let jump to a uniform random neighbor in . Formally, the coupled process moves according to the transition matrix on defined by
where and refer to transition probabilities for the original random walks on and .
Proof of Proposition 14.1(b). Fix . By reversibility it is sufficient to prove
Consider the Doeblin coupling of the processes started at and at , with coupling time . Since we have
and so we have to prove
It suffices to show that, for ,
and thus, by considering the conditional distribution of given , it suffices to show that
(14.9) |
for . So fix and . Write for the process conditioned on . By considering time running backwards from to , the processes and are the same non-homogeneous Markov chain started at the different states and , and we can use the Doeblin coupling in this non-homogeneous setting to construct versions of these processes with
Now introduce an independent copy of the original process, started at time in state . If this process meets before time then it must also meet before time , establishing (14.9).