Consider an irreducible transition matrix on a finite state space . A symmetry of is a map such that
The set of symmetries forms a group under composition, and in our (non-standard) terminology a symmetric Markov transition matrix is one for which acts transitively, i.e.
Such a chain need not be reversible; a symmetric reversible chain is just a chain which is both symmetric and reversible. A natural setting is where is itself a group under an operation which we write multiplicatively. If is a probability distribution on and are i.i.d. -valued with distribution then
(7.1) |
is the symmetric Markov chain with transition probabilities
started at . This chain is reversible iff
(7.2) |
We have rather painted ourselves into a corner over terminology. The usual terminology for the process (7.1) is “random walk on the group ” and if (7.2) holds then it is a
(7.3) |
Unfortunately in this phrase, both “symmetric” and “walk” conflict with our conventions, so we can’t use the phrase. Instead we will use “random flight on the group ” for a process (7.1), and “reversible random flight on the group ” when (7.2) also holds. Note that we always assume chains are irreducible, which in the case of a random flight holds iff the support of generates the whole group . Just keep in mind that the topic of this section, symmetric reversible chains, forms a minor generalization of the processes usually described by (7.3).
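The irreducibility criterion just stated is easy to check computationally. Here is a small illustrative sketch (the cyclic group, written additively, and the particular supports are our choices, not from the text): in a finite group the closure of the identity under repeated application of the generators is automatically a subgroup, so a breadth-first closure decides whether the support of generates the whole group.

```python
def generated_subgroup(n, support):
    """Subgroup of the cyclic group Z_n (additive) generated by `support`,
    computed as the closure of {0} under adding support elements; in a
    finite group this closure is automatically a subgroup."""
    seen = {0}
    frontier = [0]
    while frontier:
        g = frontier.pop()
        for s in support:
            h = (g + s) % n
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

# Support {2, 10} generates only the even residues of Z_12: the flight is reducible.
assert generated_subgroup(12, [2, 10]) == set(range(0, 12, 2))
# Adding 3 makes the support generate all of Z_12: the flight is irreducible.
assert generated_subgroup(12, [2, 3]) == set(range(12))
```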
On a graph , a graph automorphism is a map such that
The graph is called vertex-transitive if the automorphism group acts transitively on vertices. Clearly, random walk on a (unweighted) graph is a symmetric reversible chain iff the graph is vertex-transitive. We specialize to this case in section 7.1.8. A further specialization is to random walk on a Cayley graph. If is a set of generators of a group , which we always assume to satisfy
then the associated Cayley graph has vertex-set and edge-set
A Cayley graph is vertex-transitive.
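To see vertex-transitivity concretely, note that left translation by any group element permutes the edge-set of a Cayley graph. A minimal sketch, using the cyclic group with an illustrative symmetric generating set of our own choosing:

```python
def cayley_graph_edges(n, gens):
    """Edge-set of the Cayley graph of Z_n with generating set `gens`
    (assumed symmetric: s in gens iff -s mod n in gens, and 0 not in gens)."""
    return {frozenset((v, (v + s) % n)) for v in range(n) for s in gens}

n, gens = 9, [1, -1, 2, -2]
edges = cayley_graph_edges(n, gens)
assert len(edges) == n * len(gens) // 2   # 4-regular, so 18 edges

# Translation x -> x + 1 permutes the edge-set: it is a graph automorphism,
# and its powers carry any vertex to any other, so the graph is vertex-transitive.
shifted = {frozenset((x + 1) % n for x in e) for e in edges}
assert shifted == edges
```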
Finally, recall from Chapter 3 yyy that we can identify a reversible chain with a random walk on a weighted graph. With this identification, a symmetric reversible chain is one where the weighted graph is vertex-transitive, in the natural sense.
For an irreducible reversible chain, the following are equivalent.
(a)
(b) .
Proof. In either case the stationary distribution is uniform – under (a), by letting , and under (b) by taking , implying . So by reversibility for and . But recall from Chapter 2 Lemma yyy that the generating functions and satisfy
(7.4) |
For we have seen that , and hence by (7.4)
which is the assertion of Lemma 7.1.
Our standing assumption is that we have an irreducible symmetric reversible -state chain. The symmetry property implies that the stationary distribution is uniform, and also implies
(7.5) |
But by Chapter 3 Lemma yyy, under reversibility (7.5) is equivalent to
(7.6) |
And clearly (7.6) implies
(7.7) |
We make frequent use of these properties. Incidentally, (7.7) is in general strictly weaker than (7.6): van Slijpe [330] p. 288 gives an example with a -state reversible chain.
We also have, from the definition of symmetric, that is constant in , and hence
(7.8) |
So by Chapter 4 yyy
(7.9) |
The formula for in terms of the fundamental matrix (Chapter 2 yyy) can be written as
(7.10) |
Approximating by the first few terms is what we call the local transience heuristic. See Chapter xxx for rigorous discussion.
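The fundamental-matrix formula for mean hitting times can be checked numerically. The following sketch assumes the standard definitions (the walk on the 5-cycle, and the tolerance, are our illustrative choices): Z with entries summing the deviations of the t-step transition probabilities from stationarity, computed via the matrix identity Z = (I - P + Pi)^{-1} - Pi, and the Chapter 2 formula expressing E_i T_j through Z, which with uniform pi reads E_i T_j = n (Z_jj - Z_ij).

```python
import numpy as np

def hitting_times(P):
    """Matrix of E_i T_j, by solving the standard linear equations for each j."""
    m = P.shape[0]
    ET = np.zeros((m, m))
    for j in range(m):
        idx = [i for i in range(m) if i != j]
        ET[idx, j] = np.linalg.solve(np.eye(m - 1) - P[np.ix_(idx, idx)],
                                     np.ones(m - 1))
    return ET

# Random walk on the 5-cycle: a symmetric reversible chain with uniform pi.
n = 5
P = np.zeros((n, n))
for i in range(n):
    P[i, (i + 1) % n] = P[i, (i - 1) % n] = 0.5

Pi = np.full((n, n), 1.0 / n)
Z = np.linalg.inv(np.eye(n) - P + Pi) - Pi   # Z_ij = sum_t (p^(t)_ij - pi_j)

ET = hitting_times(P)
for i in range(n):
    for j in range(n):
        if i != j:
            # E_i T_j = (Z_jj - Z_ij) / pi_j = n (Z_jj - Z_ij) here
            assert abs(ET[i, j] - n * (Z[j, j] - Z[i, j])) < 1e-9
```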
(i) .
(ii)
Proof. (i) This is a specialization of Chapter 6 xxx.
(ii) For any ,
Averaging over , the right side becomes .
Recall that a simple Cauchy-Schwarz argument (Chapter 3 yyy) shows that, for any reversible chain whose stationary distribution is uniform,
So by (7.5), for a symmetric reversible chain, the most likely place to be after steps is where you started:
.
This type of result is nicer in continuous time, where the inequality holds for all times.
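In discrete time the inequality holds at even times, and it is easy to verify numerically. A sketch on the 6-cycle (our illustrative choice; the chain is periodic, but the even-time statement is unaffected):

```python
import numpy as np

n = 6
P = np.zeros((n, n))
for i in range(n):
    P[i, (i + 1) % n] = P[i, (i - 1) % n] = 0.5

Pt = np.eye(n)
for t in range(1, 21):
    Pt = Pt @ P
    if t % 2 == 0:
        # At even times the diagonal entry dominates its row:
        # the most likely place to be is where you started.
        assert all(Pt[0, 0] >= Pt[0, j] - 1e-12 for j in range(n))
```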
Here is our first non-trivial result, from Aldous [18].
Suppose a sequence of symmetric reversible chains satisfies . Then
(a) For the stationary chain, and for arbitrary , we have and , where has exponential(1) distribution.
(b) .
(c) If are such that then .
Note that, because and , the hypothesis “” is weaker than either “” or “”.
Part (a) is a specialization of Chapter 3 Proposition yyy and its proof. Parts (b) and (c) use refinements of the same technique. Part (b) implies
Because this applies in many settings in this Chapter, we shall rarely need to discuss further.
xxx give proof
In connection with (b), note that
(7.11) |
by definition of and vertex-transitivity. So (b) is obvious under the slightly stronger hypothesis .
Chapter 3 Proposition yyy actually gives information on hitting times to more general subsets of vertices. Because (Chapter 3 yyy) , we get (in continuous time) a quantification of the fact that has approximately exponential distribution when and when the chain starts with the uniform distribution:
Recall the cover time from Chapter 6. By symmetry, in our present setting doesn’t depend on the starting place , so we can write . In this section we combine results on hitting times with various forms of Matthews method to obtain asymptotics for cover times in the setting of a sequence of symmetric reversible chains. Experience, and the informal argument above (7.15), suggest the principle
(7.12) |
The results in this chapter concerning cover times go some way towards formalizing this principle.
For a sequence of symmetric reversible chains
(a) , where .
(b) If then .
(c) If for fixed then
Proof. Using the basic form of Matthews method (Chapter 2 yyy), (a) follows from Lemma 7.2 and (b) from Theorem 7.4. To prove (c), fix a state and . Using (7.11) and Markov’s inequality,
So we can inductively choose vertices such that
By the extended form of Matthews method (Chapter 6 Corollary yyy)
From Chapter 4 yyy, and so the hypothesis implies . So the asymptotic lower bound for becomes , and since is arbitrary the result follows.
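The basic form of Matthews method used in (a) can be seen in its sharpest case on the complete graph, where every mean hitting time equals n - 1, so the Matthews upper and lower bounds coincide and the mean cover time is exactly (n - 1) times a harmonic number (the coupon-collector value). A sketch, with illustrative simulation parameters of our own choosing:

```python
import random
from math import fsum

def harmonic(m):
    """h_m = 1 + 1/2 + ... + 1/m."""
    return fsum(1.0 / k for k in range(1, m + 1))

n = 8
matthews_bound = (n - 1) * harmonic(n - 1)   # exact E(cover time) for K_n

# Monte Carlo check on random walk on the complete graph K_n.
random.seed(0)
trials = 20000
total = 0
for _ in range(trials):
    pos, seen, steps = 0, {0}, 0
    while len(seen) < n:
        pos = (pos + random.randrange(1, n)) % n  # uniform over the other vertices
        seen.add(pos)
        steps += 1
    total += steps
estimate = total / trials
assert abs(estimate - matthews_bound) / matthews_bound < 0.05
```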
Since the only natural examples with are variations of random walk on the -cycle, for which without the “” term, we expect a positive answer to
In the setting of Corollary 7.5, is without further hypotheses?
Here is an artificial example to illustrate the bound in (c).
Two time scales.
Take such that . The underlying idea is to take two continuous-time random walks on the complete graphs and , but with the walks run on different time scales. To set this up directly in discrete time, take state space and transition probabilities
where slowly. It is not hard to formalize the following analysis. Writing the chain as , the -component stays constant for time , during which time every -value is hit, because the cover time for is . And jumps of the -component are required to hit every -value, so
(7.13) |
Now , and because the mean number of returns to the starting point before the first -jump is we can use the local transience heuristic (7.10) to see . So , and the lower bound from (c) is
But this agrees with the exact limit (7.13), because .
We now turn to sharper distributional limits for . An (easy) background fact is that, for independent random variables with exponential, mean , distribution,
where has the extreme value distribution
(7.14) |
Now the cover time is the max of the hitting times, and with the uniform initial distribution the ’s have mean . So if the ’s have approximately exponential distribution and are roughly independent of each other then we anticipate the limit result
(7.15) |
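The background fact about maxima of i.i.d. exponentials can be checked directly: the distribution function of the centered and scaled maximum is an explicit power, which converges to the double-exponential extreme value distribution. A numerical sketch (the values of n, the mean, and the tolerance are illustrative choices):

```python
from math import exp, log

def cdf_max_exponentials(n, t0, x):
    """P(max of n i.i.d. exponential(mean t0) <= t0 * (log n + x)),
    which equals (1 - exp(-x)/n)^n exactly."""
    t = t0 * (log(n) + x)
    return (1.0 - exp(-t / t0)) ** n

def gumbel_cdf(x):
    """The extreme value (double exponential) distribution function."""
    return exp(-exp(-x))

# For large n the two agree to high accuracy.
for x in (-1.0, 0.0, 1.0, 2.0):
    assert abs(cdf_max_exponentials(10**6, 3.0, x) - gumbel_cdf(x)) < 1e-4
```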
Theorem 7.4 has already given us a condition for limit exponential distributions, and we shall build on this result to give (Theorem 7.9) conditions for (7.15) to hold.
The extreme value distribution (7.14) has transform
(7.16) |
Classical probability theory (see Notes) says that to prove (7.15) it is enough to show that transforms converge, i.e. to show
(7.17) |
But Matthews method, which previously we have used on expectations, can just as well be applied to transforms. By essentially the same argument as in Chapter 2 Theorem yyy, Matthews [258] obtained
The cover time in a not-necessarily-reversible Markov chain with arbitrary initial distribution satisfies
where
Substituting into (7.17), and using the fact
we see that to establish (7.15) it suffices to prove that for arbitrary and for each fixed ,
(7.18) |
For a sequence of symmetric reversible chains, if
(a)
(b)
then
Proof. By hypothesis (a) and Theorem 7.4 (b,c), for arbitrary we have . This implies (7.18) for , and also by Fatou’s lemma implies for . Thus it is sufficient to prove
(7.19) |
The proof exploits some of our earlier general inequalities. Switch to continuous time. Fix . By conditioning on the position at some fixed time ,
By Corollary 7.3 the max is attained by , and so
We now apply some general inequalities. Chapter 4 yyy says . Writing for the quasistationary distribution on , Chapter 3 (yyy) implies and hence
But Chapter 3 Theorem yyy implies . So setting , these inequalities combine to give
But by hypothesis (b) we can choose so that each of the first two terms in the bound tends to , establishing (7.19). Finally, the effect of continuization is to change by at most , so the asymptotics remain true in discrete time.
Remark. Presumably (cf. Open Problem 7.6) the Theorem remains true without hypothesis (b).
In view of Chapter 6 yyy it is surprising that there is no obvious example to disprove
Let denote the last state to be hit. In a sequence of vertex-transitive graphs with , is it always true that converges (in variation distance, say) to the uniform distribution?
In our collection of examples in Chapter 5 of random walks on graphs, the examples with enough symmetry to fit into the present setting in fact have extra symmetry, enough to fit into the arc-transitive setting of section 7.2. So in a sense, working at the level of generality of symmetric reversible chains merely serves to illustrate what properties of chains depend only on this minimal level of symmetry. But let us point out a general construction. Suppose we have symmetric reversible chains on state spaces . Fix constants with each and with . Then (cf. Chapter 4 section yyy) we can define a “product chain” with state-space and transition probabilities
This product chain is also symmetric reversible. But if the underlying chains have extra symmetry properties, these extra properties are typically lost when one passes to the product chain. Thus we have a general method of constructing symmetric reversible chains which lack extra structure. Example 7.14 below gives a case with distinct underlying components, and Example 7.11 gives a case with a non-uniform product. In general, writing for the continuous-time eigenvalues of , we have (Chapter 4 yyy) that the continuous-time eigenvalues of the product chain are
indexed by . So in particular
and of course these parameters take the same values in discrete time.
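The product-chain eigenvalue formula is easy to verify numerically: the continuous-time eigenvalues of the weighted product are the weighted sums of the component eigenvalues, one from each coordinate. A sketch with two small cycles and illustrative weights of our own choosing:

```python
import numpy as np

def cycle_walk(n):
    """Transition matrix of random walk on the n-cycle (symmetric, uniform pi)."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, (i + 1) % n] = P[i, (i - 1) % n] = 0.5
    return P

def ct_eigenvalues(P):
    """Continuous-time eigenvalues, i.e. eigenvalues of I - P (P symmetric here)."""
    return np.sort(np.linalg.eigvalsh(np.eye(P.shape[0]) - P))

P1, P2 = cycle_walk(3), cycle_walk(4)
a1, a2 = 0.3, 0.7                        # illustrative weights, a1 + a2 = 1

# Product chain: with probability a_i, make a step in coordinate i.
Pprod = a1 * np.kron(P1, np.eye(4)) + a2 * np.kron(np.eye(3), P2)

lam1, lam2 = ct_eigenvalues(P1), ct_eigenvalues(P2)
expected = np.sort([a1 * l1 + a2 * l2 for l1 in lam1 for l2 in lam2])
assert np.allclose(ct_eigenvalues(Pprod), expected)
```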
Coordinate-biased random walk on the -cube.
Take and fix with . Then the chain with transitions
is the weighted product of two-state chains. Most of the calculations for simple symmetric random walk on the -cube done in Chapter 5 Example yyy extend to this example, with some increase of complexity. In particular,
In continuous time we still get the product form for the distribution at time :
So in a sequence of continuous time chains with , the “separation” parameter of Chapter 3 section yyy is asymptotic to the solution of
More elaborate calculations can be done to study and the discrete-time version.
Chapter 2 yyy and Chapter 4 yyy discussed quantifications of notions of “time to approach stationarity” using variation distance. The emphasis in Chapter 4 yyy was on inequalities which hold up to universal constants. In the present context of symmetric reversible chains, one can seek to do sharper calculations. Thus for random walk on the -cube (Chapter 5 Example yyy), with chances of making each possible step or staying still, writing and , we have (as ) not only the fact but also the stronger result
(7.20) |
We call this the cutoff phenomenon, and when a sequence of chains satisfies (7.20) we say the sequence has “variation cutoff at ”. As mentioned at xxx, the general theory of Chapter 4 works smoothly using , but in examples it is more natural to use , which we shall do in this chapter. Clearly, (7.20) implies the same result for and implies . Also, our convention in this chapter is to work in discrete time, whereas the Chapter 4 general theory worked more smoothly in continuous time. (Clearly (7.20) in discrete time implies the same result for the continuized chains, provided ). Note that, in the context of symmetric reversible chains,
We also can discuss separation distance (Chapter 4 yyy) which in this context is
and introduce the analogous notion of separation threshold.
It turns out that these cutoffs automatically appear in sequences of chains defined by repeated products. An argument similar to the analysis of the -cube (see [8] for a slightly different version) shows
Fix an aperiodic symmetric reversible chain with states and with relaxation time . Consider the -fold product chain with states and transition probabilities
As , this sequence of chains has variation cutoff and separation cutoff .
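The variation cutoff for the cube can be exhibited numerically. The sketch below assumes the continuized walk on the d-cube in which each coordinate flips at rate 1/d; for that normalization the cutoff falls at time (d/4) log d (an assumption of this sketch, matching the discrete-time discussion above). By independence of coordinates, variation distance from uniform reduces to a distance between two Binomial distributions.

```python
from math import comb, exp, log

def tv_from_uniform(d, t):
    """Variation distance from uniform at time t for the continuized walk on
    the d-cube (each coordinate flips at rate 1/d), started at a fixed vertex.
    Coordinates are independent, so this is a Binomial computation over the
    Hamming distance from the start."""
    q = (1.0 - exp(-2.0 * t / d)) / 2.0      # P(a given coordinate differs)
    u = 2.0 ** (-d)                          # uniform probability of a vertex
    return 0.5 * sum(comb(d, k) * abs(q ** k * (1 - q) ** (d - k) - u)
                     for k in range(d + 1))

d = 500
t_c = 0.25 * d * log(d)          # cutoff time for this normalization (assumed)
assert tv_from_uniform(d, t_c - d) > 0.9         # far from uniform just before
assert tv_from_uniform(d, t_c + 1.5 * d) < 0.05  # nearly uniform just after
```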
xxx discuss upper bound lemma
xxx heuristics
xxx mention later examples
So far we have worked in the setting of symmetric reversible chains, and haven’t used any graph theory. We now specialize to the case of random walk on a vertex-transitive or Cayley graph . As usual, we won’t write out all specializations of the previous results, but instead emphasize what extra we get from graph-theoretic arguments. Let be the degree of the graph.
For random walk on a vertex-transitive graph,
(i)
(ii)
Proof. The lower bounds are specializations of Lemma 7.2(i), i.e. of Chapter 6 xxx. For the upper bound in (ii),
(7.21) | |||||
Rearrange.
xxx mention general lower bound via tree-cover.
It is known (xxx ref) that a Cayley graph of degree is -edge-connected, and so Chapter 6 Proposition yyy gives
where .
A Cayley graph where is not the same for all edges .
Consider with generators . The figure illustrates the case .
Let’s calculate using the resistance interpretation. Put unit voltage at and zero voltage at , and let be the voltage at . By symmetry the voltage at is , so we get the equations
with . But this is just a linear difference equation, and a brief calculation gives the solution
where . The current flow is , so the effective resistance is . The commute interpretation of resistance gives , and so
where is the number of vertices. In particular,
Using the averaging property (7.21)
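The point of this example, that the mean hitting (and commute) times across an edge are not the same for all edges, can be confirmed by direct linear algebra. A sketch for one illustrative case of our own choosing, the Cayley graph of Z_12 with generators plus or minus 1 and plus or minus 2: the two orbits of edges under the automorphism group (the "step-1" and "step-2" edges) carry genuinely different commute times.

```python
import numpy as np

n, S = 12, [1, -1, 2, -2]
P = np.zeros((n, n))
for i in range(n):
    for s in S:
        P[i, (i + s) % n] += 0.25

def expected_hit(P, j):
    """Vector of E_i T_j, by solving (I - P restricted to states != j) x = 1."""
    m = P.shape[0]
    idx = [i for i in range(m) if i != j]
    ET = np.zeros(m)
    ET[idx] = np.linalg.solve(np.eye(m - 1) - P[np.ix_(idx, idx)],
                              np.ones(m - 1))
    return ET

# Commute times across the two edge-orbits: a step-1 edge and a step-2 edge.
commute_step1 = expected_hit(P, 1)[0] + expected_hit(P, 0)[1]
commute_step2 = expected_hit(P, 2)[0] + expected_hit(P, 0)[2]
assert abs(commute_step1 - commute_step2) > 1e-3   # genuinely different
```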
Turning from hitting times to mixing times, recall the Cheeger constant
where is a proper subset of vertices and
For random walk on a Cayley graph one can use simple “averaging” ideas to bound . This is Proposition 7.15 below. The result in fact extends to vertex-transitive graphs by a covering graph argument; see xxx.
Consider a -vertex Cayley graph with degree and generators , where implies . Then
where . Lower bounding the sum by its maximal term, we get
(7.22) |
On a Cayley graph of degree
(i) , where is the diameter of the graph.
(ii) for all with , where
is the radius of .
Note that is bounded by but not in general by (consider the cycle), so that (ii) implies (i) with an extra factor of . Part (i) is from Aldous [16] and (ii) is from Babai [36].
Proof. (i) Fix . Because
there exists some such that , implying
(7.23) |
We can write for some sequence of generators and some , and
So there exists with , and so (i) follows from (7.22). For part (ii), fix with , write and suppose
(7.24) |
Fix with . Since and
we have by induction
(7.25) |
Write for the ball of radius about . Since , inequality (7.25) shows that is non-empty for each , and so . But by definition of we have , implying , which in turn implies is the whole group. Now (7.25) implies that for every
But this contradicts (7.23). So (7.24) is false, i.e.
By complementation the final inequality remains true when , and the result follows from (7.22).
The “distinguished paths” method of bounding relaxation times (Chapter 4 yyy) can also be used to compare relaxation times of two random flights on the same group, and hence to bound one “unknown” relaxation time in terms of a second “known” relaxation time. This approach has been developed in great depth in
xxx ref Diaconis Saloff-Coste papers.
Here we give only the simplest of their results, from [115].
Consider generators of a group , and consider a reversible random flight with step-distribution supported on . Write for the distance from to the identity in the Cayley graph, i.e. the minimal length of a word
For each choose some minimal-length word as above and define to be the number of occurrences of in the word. Now consider a different reversible random flight on with some step-distribution , not necessarily supported on . If we know , the next result allows us to bound .
xxx give proof – tie up with discussion
Perhaps surprisingly, Theorem 7.16 gives information even when the comparison walk is the “trivial” walk whose step-distribution is uniform on the group. In this case, both and are bounded by the diameter , giving
For a reversible random flight with step-distribution on a group ,
where is the support of and is the diameter of the Cayley graph associated with .
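The constants in the Corollary are not reproduced here, but the diameter-squared scaling it expresses is easy to check on the cycle examples mentioned below: for random walk on the n-cycle the relaxation time is 1/(1 - cos(2 pi / n)), of order n squared, i.e. of order the square of the diameter. A numerical sketch (the cycle sizes and the constant 25 in the lower comparison are illustrative choices, not the Corollary's constants):

```python
import numpy as np

def relaxation_time(P):
    """tau_2 = 1 / (second-smallest eigenvalue of I - P), for symmetric P."""
    lam = np.sort(np.linalg.eigvalsh(np.eye(P.shape[0]) - P))
    return 1.0 / lam[1]

for n in (10, 20, 40, 80):
    P = np.zeros((n, n))
    for i in range(n):
        P[i, (i + 1) % n] = P[i, (i - 1) % n] = 0.5
    tau2 = relaxation_time(P)
    diam = n // 2
    # tau2 = 1/(1 - cos(2*pi/n)) ~ n^2/(2*pi^2): order diam^2, consistent
    # with a diameter-squared bound (constants here are illustrative).
    assert diam ** 2 / 25 <= tau2 <= diam ** 2
```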
When is uniform on and , the Corollary gives the bound , which improves on the bound which follows from Proposition 7.15 and Cheeger’s inequality (Chapter 4 yyy). The examples of the torus show that enters naturally, but one could hope for the following variation.
Write for the minimum of over all symmetric random flights on with step-distribution supported on . Is it true that ?