# 7.1 Symmetric reversible chains

## 7.1.1 Definitions

Consider an irreducible transition matrix ${\bf P}=(p_{ij})$ on a finite state space $I$. A symmetry of ${\bf P}$ is a $1-1$ map $\gamma:I\rightarrow I$ such that

 $p_{\gamma(i)\gamma(j)}=p_{ij}\mbox{ for all }i,j.$

The set $\Gamma$ of symmetries forms a group under composition, and in our (non-standard) terminology a symmetric Markov transition matrix is one for which $\Gamma$ acts transitively, i.e.

 $\mbox{for all }i,j\in I\mbox{ there exists }\gamma\in\Gamma\mbox{ such that }% \gamma(i)=j.$

Such a chain need not be reversible; a symmetric reversible chain is just a chain which is both symmetric and reversible. A natural setting is where $I$ is itself a group under an operation $(i,j)\rightarrow ij$ which we write multiplicatively. If $\mu$ is a probability distribution on $I$ and $(Z_{t};t\geq 1)$ are i.i.d. $I$-valued with distribution $\mu$ then

 $X_{t}=x_{0}Z_{1}Z_{2}\ldots Z_{t}$ (7.1)

is the symmetric Markov chain with transition probabilities

 $p_{ij}=\mu(i^{-1}j)$

started at $x_{0}$. This chain is reversible iff

 $\mu(i)=\mu(i^{-1})\mbox{ for all }i.$ (7.2)

We have rather painted ourselves into a corner over terminology. The usual terminology for the process (7.1) is “random walk on the group $I$”, and if (7.2) holds then it is a

 $\mbox{symmetric random walk on the group }I.$ (7.3)

Unfortunately in this phrase, both “symmetric” and “walk” conflict with our conventions, so we can’t use the phrase. Instead we will use “random flight on the group $I$” for a process (7.1), and “reversible random flight on the group $I$” when (7.2) also holds. Note that we always assume chains are irreducible, which in the case of a random flight holds iff the support of $\mu$ generates the whole group $I$. Just keep in mind that the topic of this section, symmetric reversible chains, forms a minor generalization of the processes usually described by (7.3).
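As a quick sanity check of (7.2), here is a minimal sketch (Python; the step-distributions on the additive group $Z_{6}$ are illustrative assumptions) that builds the flight's transition matrix $p_{ij}=\mu(i^{-1}j)$ and tests detailed balance with respect to the uniform distribution:

```python
from fractions import Fraction as F

n = 6  # the group Z_6 under addition mod n

def flight_matrix(mu):
    # p_ij = mu(i^{-1} j) = mu(j - i mod n) for the additive group Z_n
    return [[mu[(j - i) % n] for j in range(n)] for i in range(n)]

mu_sym = {0: F(0), 1: F(1, 3), 5: F(1, 3), 2: F(1, 6), 4: F(1, 6), 3: F(0)}
mu_asym = {0: F(0), 1: F(1, 2), 5: F(1, 4), 2: F(1, 4), 4: F(0), 3: F(0)}

def reversible(P):
    # detailed balance against uniform pi reduces to p_ij == p_ji
    return all(P[i][j] == P[j][i] for i in range(n) for j in range(n))

rev_sym = reversible(flight_matrix(mu_sym))    # mu(i) == mu(-i): reversible
rev_asym = reversible(flight_matrix(mu_asym))  # mu(1) != mu(-1): not reversible
```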

On a graph $(\mbox{{\cal V}},\mbox{{\cal E}})$, a graph automorphism is a $1-1$ map $\gamma:\mbox{{\cal V}}\rightarrow\mbox{{\cal V}}$ such that

 $(\gamma(v),\gamma(w))\in\mbox{{\cal E}}\mbox{ iff }(v,w)\in\mbox{{\cal E}}.$

The graph is called vertex-transitive if the automorphism group acts transitively on vertices. Clearly, random walk on an (unweighted) graph is a symmetric reversible chain iff the graph is vertex-transitive. We specialize to this case in section 7.1.8. A further specialization is to random walk on a Cayley graph. If $\mbox{{\cal G}}=(g_{i})$ is a set of generators of a group $I$, which we always assume to satisfy

 $g\in\mbox{{\cal G}}\mbox{ implies }g^{-1}\in\mbox{{\cal G}},$

then the associated Cayley graph has vertex-set $I$ and edge-set

 $\{(v,vg):v\in I,g\in\mbox{{\cal G}}\}.$

A Cayley graph is vertex-transitive.

Finally, recall from Chapter 3 yyy that we can identify a reversible chain with a random walk on a weighted graph. With this identification, a symmetric reversible chain is one where the weighted graph is vertex-transitive, in the natural sense.

## 7.1.2 This section goes into Chapter 3

###### Lemma 7.1

For an irreducible reversible chain, the following are equivalent.

(a) $P_{i}(X_{t}=i)=P_{j}(X_{t}=j),i,j\in I,t\geq 1$

(b) $P_{i}(T_{j}=t)=P_{j}(T_{i}=t),i,j\in I,t\geq 1$.

Proof. In either case the stationary distribution is uniform – under (a), by letting $t\rightarrow\infty$, and under (b) by taking $t=1$, implying $p_{ij}\equiv p_{ji}$. So by reversibility $P_{i}(X_{t}=j)=P_{j}(X_{t}=i)$ for $i\neq j$ and $t\geq 1$. But recall from Chapter 2 Lemma yyy that the generating functions $G_{ij}(z)=\sum_{t}P_{i}(X_{t}=j)z^{t}$ and $F_{ij}(z)=\sum_{t}P_{i}(T_{j}=t)z^{t}$ satisfy

 $F_{ij}=G_{ij}/G_{jj}.$ (7.4)

For $i\neq j$ we have seen that $G_{ij}=G_{ji}$, and hence by (7.4)

 $F_{ij}=F_{ji}\mbox{ iff }G_{jj}=G_{ii},$

which is the assertion of Lemma 7.1.
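Lemma 7.1 is easy to check numerically. The sketch below (Python; simple random walk on the 5-cycle, which is symmetric reversible) computes first-passage distributions by making the target state absorbing and verifies property (b) exactly in rational arithmetic:

```python
from fractions import Fraction as F

n = 5  # simple random walk on the 5-cycle
P = [[F(1, 2) if (j - i) % n in (1, n - 1) else F(0) for j in range(n)]
     for i in range(n)]

def first_passage(i, j, tmax):
    # the distribution P_i(T_j = t), t = 1..tmax, with j made absorbing
    dist = []
    v = [F(int(k == i)) for k in range(n)]
    for _ in range(tmax):
        w = [F(0)] * n
        for a in range(n):
            if a == j or v[a] == 0:
                continue
            for b in range(n):
                w[b] += v[a] * P[a][b]
        dist.append(w[j])   # mass first entering j at this step
        w[j] = F(0)         # absorb at j
        v = w
    return dist

# property (b): P_i(T_j = t) = P_j(T_i = t) for all i != j and t >= 1
sym_ok = all(first_passage(i, j, 12) == first_passage(j, i, 12)
             for i in range(n) for j in range(n) if i != j)
```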

## 7.1.3 Elementary properties

Our standing assumption is that we have an irreducible symmetric reversible $n$-state chain. The symmetry property implies that the stationary distribution $\pi$ is uniform, and also implies

 $P_{i}(X_{t}=i)=P_{j}(X_{t}=j),i,j\in I,t\geq 1.$ (7.5)

But by Chapter 3 Lemma yyy, under reversibility (7.5) is equivalent to

 $P_{i}(T_{j}=t)=P_{j}(T_{i}=t),i,j\in I,t\geq 1.$ (7.6)

And clearly (7.6) implies

 $E_{i}T_{j}=E_{j}T_{i}\mbox{ for all }i,j.$ (7.7)

We make frequent use of these properties. Incidentally, (7.7) is in general strictly weaker than (7.6): van Slijpe [330] p. 288 gives an example with a $3$-state reversible chain.

We also have, from the definition of symmetric, that $E_{\pi}T_{i}$ is constant in $i$, and hence

 $E_{\pi}T_{i}=\tau_{0}\mbox{ for all }i.$ (7.8)

So by Chapter 4 yyy

 $\tau^{*}\leq 4\tau_{0}.$ (7.9)

The formula for $E_{\pi}T_{i}$ in terms of the fundamental matrix (Chapter 2 yyy) can be written as

 $\tau_{0}/n=1-\frac{1}{n}+\sum_{t=1}^{\infty}(P_{i}(X_{t}=i)-1/n).$ (7.10)

Approximating $\tau_{0}$ by the first few terms is what we call the local transience heuristic. See Chapter xxx for rigorous discussion.
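The identity behind (7.10) is $E_{\pi}T_{i}=nZ_{ii}$ for the fundamental-matrix entry $Z_{ii}=\sum_{t\geq 0}(P_{i}(X_{t}=i)-1/n)$. A numerical sketch for the 5-cycle, where $E_{k}T_{0}=k(n-k)$ is classical:

```python
n = 5  # simple random walk on the 5-cycle (odd, hence aperiodic)
P = [[0.5 if (j - i) % n in (1, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]

# E_pi T_0 directly: on the n-cycle E_k T_0 = k(n-k), so average over k
tau0 = sum(k * (n - k) for k in range(n)) / n   # = 4.0 for n = 5

# fundamental-matrix series: Z_00 = sum_{t>=0} (P_0(X_t=0) - 1/n), truncated
v = [1.0, 0.0, 0.0, 0.0, 0.0]
Z = 0.0
for _ in range(500):
    Z += v[0] - 1.0 / n
    v = [sum(v[a] * P[a][b] for a in range(n)) for b in range(n)]

err = abs(n * Z - tau0)   # checks the identity E_pi T_i = n * Z_ii
```

The truncation at 500 terms is harmless here since the series converges geometrically at rate $|\cos(2\pi/5)|$.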

###### Lemma 7.2

(i) $E_{i}T_{j}\geq\frac{n}{1+p(i,j)},\ j\neq i$.

(ii) $\max_{i,j}E_{i}T_{j}\leq 2\tau_{0}$

Proof. (i) This is a specialization of Chapter 6 xxx.

(ii) For any $i,j,k$,

 $E_{i}T_{j}\leq E_{i}T_{k}+E_{k}T_{j}=E_{i}T_{k}+E_{j}T_{k}.$

Averaging over $k$, the right side becomes $2\tau_{0}$.

Recall that a simple Cauchy–Schwarz argument (Chapter 3 yyy) shows that, for any reversible chain whose stationary distribution is uniform,

 $P_{i}(X_{2t}=j)\leq\sqrt{P_{i}(X_{2t}=i)P_{j}(X_{2t}=j)}.$

So by (7.5), for a symmetric reversible chain, the most likely place to be after $2t$ steps is where you started:

###### Corollary 7.3

$P_{i}(X_{2t}=j)\leq P_{i}(X_{2t}=i),\mbox{ for all }i,j\in I,t\geq 1$.

This type of result is nicer in continuous time, where the inequality holds for all times.
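A quick numerical illustration of Corollary 7.3 (Python; random walk on the 5-cycle started at state $0$):

```python
n = 5
P = [[0.5 if (j - i) % n in (1, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]

def row(i, t):
    # distribution of X_t started at i, by repeated matrix-vector multiply
    v = [float(k == i) for k in range(n)]
    for _ in range(t):
        v = [sum(v[a] * P[a][b] for a in range(n)) for b in range(n)]
    return v

# at every even time the most likely state is the starting state
diag_max = all(max(row(0, 2 * t)) == row(0, 2 * t)[0] for t in range(1, 20))
```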

## 7.1.4 Hitting times

Here is our first non-trivial result, from Aldous [18].

###### Theorem 7.4

Suppose a sequence of symmetric reversible chains satisfies $\tau_{2}/\tau_{0}\rightarrow 0$. Then

(a) For the stationary chain, and for arbitrary $j$, we have $T_{j}/\tau_{0}\ \stackrel{d}{\rightarrow}\ \xi$ and ${\rm var}\ (T_{j}/\tau_{0})\rightarrow 1$, where $\xi$ has exponential(1) distribution.

(b) $\max_{i,j}E_{i}T_{j}/\tau_{0}\rightarrow 1$.

(c) If $(i_{n},j_{n})$ are such that $E_{i_{n}}T_{j_{n}}/\tau_{0}\rightarrow 1$ then $P_{i_{n}}(T_{j_{n}}/\tau_{0}\in\cdot)\ \stackrel{d}{\rightarrow}\ \xi$.

Note that, because $\tau_{2}\leq\tau_{1}+1$ and $\tau_{0}\geq(n-1)^{2}/n$, the hypothesis “$\tau_{2}/\tau_{0}\rightarrow 0$” is weaker than either “$\tau_{2}/n\rightarrow 0$” or “$\tau_{1}/\tau_{0}\rightarrow 0$”.

Part (a) is a specialization of Chapter 3 Proposition yyy and its proof. Parts (b) and (c) use refinements of the same technique. Part (b) implies

 $\mbox{ if }\tau_{2}/\tau_{0}\rightarrow 0\mbox{ then }\tau^{*}\sim 2\tau_{0}.$

Because this applies in many settings in this Chapter, we shall rarely need to discuss $\tau^{*}$ further.

xxx give proof

In connection with (b), note that

 $E_{v}T_{w}\leq\tau_{1}^{(2)}+\tau_{0}$ (7.11)

by definition of $\tau_{1}^{(2)}$ and vertex-transitivity. So (b) is obvious under the slightly stronger hypothesis $\tau_{1}/\tau_{0}\rightarrow 0$.

Chapter 3 Proposition yyy actually gives information on hitting times $T_{A}$ to more general subsets $A$ of vertices. Because (Chapter 3 yyy) $E_{\pi}T_{A}\geq\frac{(1-\pi(A))^{2}}{\pi(A)}$, we get (in continuous time) a quantification of the fact that $T_{A}$ has approximately exponential distribution when $|A|\ll n/\tau_{2}$ and when the chain starts with the uniform distribution:

 $\sup_{t}|P_{\pi}(T_{A}>t)-\exp(-t/E_{\pi}T_{A})|\leq\frac{\tau_{2}|A|}{n}\left(1-\frac{|A|}{n}\right)^{-2}.$

## 7.1.5 Cover times

Recall the cover time $C$ from Chapter 6. By symmetry, in our present setting $E_{i}C$ doesn’t depend on the starting place $i$, so we can write $EC$. In this section we combine results on hitting times with various forms of Matthews method to obtain asymptotics for cover times in the setting of a sequence of symmetric reversible chains. Experience, and the informal argument leading to (7.15) below, suggest the principle

 $EC\sim\tau_{0}\log n.$ (7.12)

The results in this chapter concerning cover times go some way towards formalizing this principle.

###### Corollary 7.5

For a sequence of symmetric reversible chains

(a) $\frac{1-o(1)}{1+p^{*}}\ n\log n\leq EC\leq(2+o(1))\tau_{0}\log n$, where $p^{*}\equiv\max_{j\neq i}p_{i,j}$.

(b) If $\tau_{2}/\tau_{0}\rightarrow 0$ then $EC\leq(1+o(1))\tau_{0}\log n$.

(c) If $\tau_{2}/\tau_{0}=O(n^{-\beta})$ for fixed $0<\beta<1$ then

 $EC\geq(\beta-o(1))\tau_{0}\log n.$

Proof. Using the basic form of Matthews method (Chapter 2 yyy), (a) follows from Lemma 7.2 and (b) from Theorem 7.4. To prove (c), fix a state $j$ and $\varepsilon>0$. Using (7.11) and Markov’s inequality,

 $\pi\{i:E_{i}T_{j}\leq(1-\varepsilon)\tau_{0}\}\leq\frac{\tau_{1}^{(2)}}{\varepsilon\tau_{0}}\equiv\alpha,\mbox{ say.}$

So we can inductively choose $\lceil\alpha^{-1}\rceil$ vertices $i_{k}$ such that

 $E_{i_{k}}T_{i_{l}}>(1-\varepsilon)\tau_{0};\ 1\leq l<k\leq\lceil\alpha^{-1}\rceil.$

By the extended form of Matthews method (Chapter 6 Corollary yyy)

 $EC\geq(1-\varepsilon)\tau_{0}h_{\lceil\alpha^{-1}\rceil-1}.$

From Chapter 4 yyy, $\tau_{1}\leq\tau_{2}(1+\log n)$ and so the hypothesis implies $\tau_{1}/\tau_{0}=O(n^{\varepsilon-\beta})$. So the asymptotic lower bound for $EC$ becomes $(1-\varepsilon)\tau_{0}(\beta-\varepsilon)\log n$, and since $\varepsilon$ is arbitrary the result follows.

Since the only natural examples with $\tau_{1}/\tau_{0}\not\rightarrow 0$ are variations of random walk on the $n$-cycle, for which $EC=\Theta(\tau_{0})$ without the “$\log n$” term, we expect a positive answer to

###### Open Problem 7.6

In the setting of Corollary 7.5, is $EC\leq(1+o(1))\tau_{0}\log n$ without further hypotheses?

Here is an artificial example to illustrate the bound in (c).

###### Example 7.7

Two time scales.

Take $m_{1}=m_{1}(n),m_{2}=m_{2}(n)$ such that $m_{1}\sim n^{1-\beta},m_{1}m_{2}\sim n$. The underlying idea is to take two continuous-time random walks on the complete graphs $K_{m_{1}}$ and $K_{m_{2}}$, but with the walks run on different time scales. To set this up directly in discrete time, take state space $\{(x,y):1\leq x\leq m_{1},1\leq y\leq m_{2}\}$ and transition probabilities

 $\displaystyle(x,y)\rightarrow(x^{\prime},y)\mbox{ with chance }(m_{1}-1)^{-1}\left(1-\frac{1}{am_{1}\log m_{1}}\right),\ x^{\prime}\neq x$
 $\displaystyle(x,y)\rightarrow(x,y^{\prime})\mbox{ with chance }(m_{2}-1)^{-1}\frac{1}{am_{1}\log m_{1}},\ y^{\prime}\neq y$

where $a=a(n)\uparrow\infty$ slowly. It is not hard to formalize the following analysis. Writing the chain as $(X_{t},Y_{t})$, the $Y$-component stays constant for time $\Theta(am_{1}\log m_{1})$, during which time every $x$-value is hit, because the cover time for $K_{m_{1}}$ is $\sim m_{1}\log m_{1}$. And $m_{2}\log m_{2}$ jumps of the $Y$-component are required to hit every $y$-value, so

 $EC\sim(m_{2}\log m_{2})\times(am_{1}\log m_{1})\sim an(\log m_{1})(\log m_{2}).$ (7.13)

Now $\tau_{2}\sim am_{1}\log m_{1}$, and because the mean number of returns to the starting point before the first $Y$-jump is $\sim a\log m_{1}$ we can use the local transience heuristic (7.10) to see $\tau_{0}\sim(a\log m_{1})\times n$. So $\tau_{2}/\tau_{0}\sim m_{1}/n\sim n^{-\beta}$, and the lower bound from (c) is

 $(\beta-o(1))(a\log m_{1})n\log n.$

But this agrees with the exact limit (7.13), because $m_{2}\sim n^{\beta}$.

We now turn to sharper distributional limits for $C$. An (easy) background fact is that, for independent random variables $(Z_{i})$ with exponential distribution of mean $\tau$,

 $\frac{\max(Z_{1},\ldots,Z_{n})-\tau\log n}{\tau}\ \stackrel{d}{\rightarrow}\ \eta$

where $\eta$ has the extreme value distribution

 $P(\eta\leq x)=\exp(-e^{-x}),\ -\infty<x<\infty.$ (7.14)
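This background fact has a short exact verification: $P(\max_{i}Z_{i}\leq\tau(\log n+x))=(1-e^{-x}/n)^{n}\rightarrow\exp(-e^{-x})$. A sketch:

```python
import math

def max_exp_cdf(n, x, tau=1.0):
    # P(max(Z_1..Z_n) <= tau*(log n + x)) for i.i.d. exponential(mean tau)
    t = tau * (math.log(n) + x)
    return (1.0 - math.exp(-t / tau)) ** n

def gumbel(x):
    # extreme value distribution (7.14)
    return math.exp(-math.exp(-x))

# for n = 10^6 the two distribution functions agree to within ~1e-6
dev = max(abs(max_exp_cdf(10**6, x) - gumbel(x)) for x in (-1.0, 0.0, 1.0, 2.0))
```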

Now the cover time $C=\max_{i}T_{i}$ is the max of the hitting times, and with the uniform initial distribution the $T_{i}$’s have mean $\tau_{0}$. So if the $T_{i}$’s have approximately exponential distribution and are roughly independent of each other then we anticipate the limit result

 $\frac{C-\tau_{0}\log n}{\tau_{0}}\ \stackrel{d}{\rightarrow}\ \eta.$ (7.15)

Theorem 7.4 has already given us a condition for limit exponential distributions, and we shall build on this result to give (Theorem 7.9) conditions for (7.15) to hold.

The extreme value distribution (7.14) has transform

 $E\exp(\theta\eta)=\Gamma(1-\theta),\ -\infty<\theta<1.$ (7.16)

Classical probability theory (see Notes) says that to prove (7.15) it is enough to show that transforms converge, i.e. to show

 $E\exp(\theta C/\tau_{0})\sim n^{\theta}\Gamma(1-\theta),\ -\infty<\theta<1.$ (7.17)

But Matthews method, which previously we have used on expectations, can just as well be applied to transforms. By essentially the same argument as in Chapter 2 Theorem yyy, Matthews [258] obtained

###### Proposition 7.8

The cover time $C$ in a not-necessarily-reversible Markov chain with arbitrary initial distribution satisfies

 $\frac{\Gamma(n+1)\Gamma(1/f_{*}(\beta))}{\Gamma(n+1/f_{*}(\beta))}\leq E\exp(\beta C)\leq\frac{\Gamma(n+1)\Gamma(1/f^{*}(\beta))}{\Gamma(n+1/f^{*}(\beta))}$

where

 $\displaystyle f^{*}(\beta)\equiv\max_{j\neq i}E_{i}\exp(\beta T_{j})$ $\displaystyle f_{*}(\beta)\equiv\min_{j\neq i}E_{i}\exp(\beta T_{j}).$

Substituting into (7.17), and using the fact

 $\frac{\Gamma(n+1)}{\Gamma(n+1-s_{n})}\sim n^{s}\mbox{ as }n\rightarrow\infty,\ s_{n}\rightarrow s$

we see that to establish (7.15) it suffices to prove that for arbitrary $j_{n}\neq i_{n}$ and for each fixed $-\infty<\theta<1$,

 $E_{i_{n}}\exp(\theta T_{j_{n}}/\tau_{0})\rightarrow\frac{1}{1-\theta}.$ (7.18)
###### Theorem 7.9

For a sequence of symmetric reversible chains, if

(a) $\min_{j\neq i}E_{i}T_{j}=\tau_{0}(1-o(1))$

(b) $\tau_{2}/\tau_{0}=o\left(\frac{1}{\log n}\right)$

then

 $\frac{C-\tau_{0}\log n}{\tau_{0}}\ \stackrel{d}{\rightarrow}\ \eta.$

Proof. By hypothesis (a) and Theorem 7.4 (b,c), for arbitrary $j_{n}\neq i_{n}$ we have $P_{i_{n}}(T_{j_{n}}/\tau_{0}\in\cdot)\ \stackrel{d}{\rightarrow}\ \xi$. This implies (7.18) for $\theta\leq 0$, and by Fatou’s lemma it also implies $\liminf_{n}E_{i_{n}}\exp(\theta T_{j_{n}}/\tau_{0})\geq\frac{1}{1-\theta}$ for $0<\theta<1$. Thus it is sufficient to prove

 $\max_{j\neq i}E_{i}\exp(\theta T_{j}/\tau_{0})\leq\frac{1+o(1)}{1-\theta},\ 0<\theta<1.$ (7.19)

The proof exploits some of our earlier general inequalities. Switch to continuous time. Fix $\beta>0$. By conditioning on the position at some fixed time $s$,

 $E_{i}\exp(\beta(T_{j}-s)^{+})\leq\max_{x}(nP_{i}(X_{s}=x))\times E_{\pi}\exp(\beta T_{j}).$

By Corollary 7.3 the max is attained by $x=i$, and so

 $E_{i}\exp(\beta T_{j})\leq(nP_{i}(X_{s}=i)e^{\beta s})\times E_{\pi}\exp(\beta T_{j}).$

We now apply some general inequalities. Chapter 4 yyy says $nP_{i}(X_{s}=i)\leq 1+n\exp(-s/\tau_{2})$. Writing $\alpha_{j}$ for the quasistationary distribution on $\{j\}^{c}$, Chapter 3 (yyy) implies $P_{\pi}(T_{j}>t)\leq\exp(-t/E_{\alpha_{j}}T_{j})$ and hence

 $E_{\pi}\exp(\beta T_{j})\leq\frac{1}{1-\beta E_{\alpha_{j}}T_{j}}.$

But Chapter 3 Theorem yyy implies $E_{\alpha_{j}}T_{j}\leq\tau_{0}+\tau_{2}$. So setting $\beta=\theta/\tau_{0}$, these inequalities combine to give

 $E_{i}\exp(\theta T_{j}/\tau_{0})\leq(1+n\exp(-s/\tau_{2}))\times\exp(\theta s/\tau_{0})\times\frac{1}{1-\theta(1+\tau_{2}/\tau_{0})}.$

But by hypothesis (b) we can choose $s$ with $\tau_{2}\log n\ll s\ll\tau_{0}$, making each of the first two factors in the bound tend to $1$ and establishing (7.19). Finally, the effect of continuization is to change $C$ by at most $O(\sqrt{EC})$, so the asymptotics remain true in discrete time.

Remark. Presumably (c.f. Open Problem 7.6) the Theorem remains true without hypothesis (b).

In view of Chapter 6 yyy it is surprising that there is no obvious example to disprove

###### Open Problem 7.10

Let $V$ denote the last state to be hit. In a sequence of vertex-transitive graphs with $n\rightarrow\infty$, is it always true that $V$ converges (in variation distance, say) to the uniform distribution?

## 7.1.6 Product chains

In our collection of examples in Chapter 5 of random walks on graphs, the examples with enough symmetry to fit into the present setting have in fact extra symmetry, enough to fit into the arc-transitive setting of section 7.2. So in a sense, working at the level of generality of symmetric reversible chains merely serves to illustrate what properties of chains depend only on this minimal level of symmetry. But let us point out a general construction. Suppose we have symmetric reversible chains $X^{(1)},\ldots,X^{(d)}$ on state spaces $I^{(1)},\ldots,I^{(d)}$. Fix constants $a_{1},\ldots,a_{d}$ with each $a_{i}>0$ and with $\sum_{i}a_{i}=1$. Then (c.f. Chapter 4 section yyy) we can define a “product chain” with state-space $I^{(1)}\times\ldots\times I^{(d)}$ and transition probabilities

 $(x_{1},\ldots,x_{d})\rightarrow(x_{1},\ldots,x^{\prime}_{i},\ldots,x_{d})\mbox{: probability }a_{i}P(X^{(i)}_{1}=x^{\prime}_{i}|X^{(i)}_{0}=x_{i}).$

This product chain is also symmetric reversible. But if the underlying chains have extra symmetry properties, these extra properties are typically lost when one passes to the product chain. Thus we have a general method of constructing symmetric reversible chains which lack extra structure. Example 7.14 below gives a case with distinct underlying components, and Example 7.11 gives a case with a non-uniform product. In general, writing $(\lambda^{(i)}_{u}:1\leq u\leq|I^{(i)}|)$ for the continuous-time eigenvalues of $X^{(i)}$, we have (Chapter 4 yyy) that the continuous-time eigenvalues of the product chain are

 $\lambda_{{\bf u}}=a_{1}\lambda^{(1)}_{u_{1}}+\ldots+a_{d}\lambda^{(d)}_{u_{d}}$

indexed by ${\bf u}=(u_{1},\ldots,u_{d})\in\{1,\ldots,|I^{(1)}|\}\times\ldots\times\{1,\ldots,|I^{(d)}|\}$. So in particular

 $\tau_{2}=\max_{i}\frac{\tau_{2}^{(i)}}{a_{i}}$
 $\tau_{0}=\sum_{{\bf u}\neq(1,\ldots,1)}\frac{1}{a_{1}\lambda^{(1)}_{u_{1}}+% \ldots+a_{d}\lambda^{(d)}_{u_{d}}}$

and of course these parameters take the same values in discrete time.
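As a concrete check of the eigenvalue formula, take $d=2$ with both components the two-state flip chain (continuous-time eigenvalues $0$ and $2$) and assumed weights $a_{1}=1/4$, $a_{2}=3/4$; the functions $f_{S}(b)=\prod_{i\in S}(-1)^{b_{i}}$ are then eigenvectors of the product chain, a sketch (Python, exact rational arithmetic):

```python
from fractions import Fraction as F
from itertools import product

a = [F(1, 4), F(3, 4)]              # assumed weights a_1, a_2 (a_1 minimal)
states = list(product([0, 1], repeat=2))

def flip(b, i):
    return tuple(1 - x if k == i else x for k, x in enumerate(b))

def P(b, c):
    # product chain: flip coordinate i with probability a_i
    return sum(a[i] for i in range(2) if flip(b, i) == c)

def apply_P(f):
    return {b: sum(P(b, c) * f[c] for c in states) for b in states}

ok = True
for S in [(0,), (1,), (0, 1)]:
    f = {b: F((-1) ** sum(b[i] for i in S)) for b in states}
    lam = 1 - 2 * sum(a[i] for i in S)   # discrete eigenvalue; 1-lam matches
    ok = ok and apply_P(f) == {b: lam * f[b] for b in states}
# continuous-time eigenvalues 1-lam are a_1*l_{u_1}+a_2*l_{u_2}, l in {0,2},
# so the relaxation time is 1/(2*a_1), agreeing with max_i tau_2^{(i)}/a_i
```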

###### Example 7.11

Coordinate-biased random walk on the $d$-cube.

Take $I=\{0,1\}^{d}$ and fix $0<a_{1}\leq a_{2}\leq\ldots\leq a_{d}$ with $\sum_{i}a_{i}=1$. Then the chain with transitions

 $(b_{1},\ldots,b_{d})\rightarrow(b_{1},\ldots,1-b_{i},\ldots,b_{d})\mbox{ : % probability }a_{i}$

is the weighted product of two-state chains. Most of the calculations for simple symmetric random walk on the $d$-cube done in Chapter 5 Example yyy extend to this example, with some increase of complexity. In particular,

 $\tau_{2}=\frac{1}{2a_{1}}$
 $\tau_{0}=\frac{1}{2}\sum_{{\bf u}\in I,{\bf u}\neq{\bf 0}}\frac{1}{\sum_{i=1}^{d}u_{i}a_{i}}.$

In continuous time we still get the product form for the distribution at time $t$:

 $P_{b}(X_{t}=b^{\prime})=2^{-d}\prod_{i}(1+\eta_{i}\exp(-2a_{i}t))\mbox{ ; }\eta_{i}=1\mbox{ if }b^{\prime}_{i}=b_{i},\ =-1\mbox{ if not.}$

So in a sequence of continuous time chains with $d\rightarrow\infty$, the “separation” parameter $\tau^{(1)}_{1}$ of Chapter 3 section yyy is asymptotic to the solution $t$ of

 $\sum_{i}\exp(-2a_{i}t)=-\log(1-e^{-1}).$

More elaborate calculations can be done to study $\tau_{1}$ and the discrete-time version.
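The product form can be cross-checked against the continuized chain computed by the Poisson (uniformization) series, here for $d=2$ with assumed rates $a=(0.3,0.7)$ and the sign convention $\eta_{i}=\pm 1$ according as $b^{\prime}_{i}=b_{i}$ or not:

```python
import math

a = [0.3, 0.7]   # assumed weights, d = 2
d = 2
states = [(x, y) for x in (0, 1) for y in (0, 1)]

def P(b, c):
    # discrete step: flip coordinate i with probability a_i
    diffs = [i for i in range(d) if b[i] != c[i]]
    return a[diffs[0]] if len(diffs) == 1 else 0.0

def heat_kernel(b, c, t, terms=80):
    # continuized chain: P_b(X_t=c) = sum_k e^{-t} t^k/k! * P^k(b,c)
    v = {s: float(s == b) for s in states}
    total = float(b == c) * math.exp(-t)
    w = math.exp(-t)
    for k in range(1, terms):
        v = {s2: sum(v[s1] * P(s1, s2) for s1 in states) for s2 in states}
        w *= t / k
        total += w * v[c]
    return total

def product_form(b, c, t):
    # 2^{-d} prod_i (1 + eta_i exp(-2 a_i t)), eta_i = +1 iff c_i == b_i
    out = 2.0 ** (-d)
    for i in range(d):
        eta = 1.0 if b[i] == c[i] else -1.0
        out *= 1.0 + eta * math.exp(-2 * a[i] * t)
    return out

dev = max(abs(heat_kernel(b, c, 1.3) - product_form(b, c, 1.3))
          for b in states for c in states)
```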

## 7.1.7 The cutoff phenomenon and the upper bound lemma

Chapter 2 yyy and Chapter 4 yyy discussed quantifications of notions of “time to approach stationarity” using variation distance. The emphasis in Chapter 4 yyy was on inequalities which hold up to universal constants. In the present context of symmetric reversible chains, one can seek to do sharper calculations. Thus for random walk on the $d$-cube (Chapter 5 Example yyy), with chances $1/(d+1)$ of making each possible step or staying still, writing $n=2^{d}$ and $c_{n}=\frac{1}{4}d\log d$, we have (as $n\rightarrow\infty$) not only the fact $\tau_{1}\sim c_{n}$ but also the stronger result

 $d((1+\varepsilon)c_{n})\rightarrow 0\mbox{ and }d((1-\varepsilon)c_{n})\rightarrow 1\mbox{, for all }\varepsilon>0.$ (7.20)

We call this the cutoff phenomenon, and when a sequence of chains satisfies (7.20) we say the sequence has “variation cutoff at $c_{n}$”. As mentioned at xxx, the general theory of Chapter 4 works smoothly using $\bar{d}(t)$, but in examples it is more natural to use $d(t)$, which we shall do in this chapter. Clearly, (7.20) implies the same result for $\bar{d}$ and implies $\tau_{1}\sim c_{n}$. Also, our convention in this chapter is to work in discrete time, whereas the Chapter 4 general theory worked more smoothly in continuous time. (Clearly (7.20) in discrete time implies the same result for the continuized chains, provided $c_{n}\rightarrow\infty$). Note that, in the context of symmetric reversible chains,

 $d(t)=d_{i}(t)=||P_{i}(X_{t}\in\cdot)-\pi(\cdot)||\mbox{ for each }i.$

We also can discuss separation distance (Chapter 4 yyy) which in this context is

 $s(t)=1-n\min_{j}P_{i}(X_{t}=j)\mbox{ for each }i,$

and introduce the analogous notion of separation threshold.

It turns out that these cut-offs automatically appear in sequences of chains defined by repeated products. An argument similar to the analysis of the $d$-cube (see [8] for a slightly different version) shows

###### Lemma 7.12

Fix an aperiodic symmetric reversible chain with $m$ states and with relaxation time $\tau_{2}=1/(1-\lambda_{2})$. Consider the $d$-fold product chain with $n=m^{d}$ states and transition probabilities

 $(x_{1},\ldots,x_{d})\rightarrow(x_{1},\ldots,y_{i},\ldots,x_{d})\mbox{ : probability }\frac{1}{d}\ p_{x_{i},y_{i}}.$

As $d\rightarrow\infty$, this sequence of chains has variation cutoff $\frac{1}{2}\tau_{2}d\log d$ and separation cut-off $\tau_{2}d\log d$.

xxx discuss upper bound lemma

xxx heuristics

xxx mention later examples

## 7.1.8 Vertex-transitive graphs and Cayley graphs

So far we have worked in the setting of symmetric reversible chains, and haven’t used any graph theory. We now specialize to the case of random walk on a vertex-transitive or Cayley graph $(\mbox{{\cal V}},\mbox{{\cal E}})$. As usual, we won’t write out all specializations of the previous results, but instead emphasize what extra we get from graph-theoretic arguments. Let $d$ be the degree of the graph.

###### Lemma 7.13

For random walk on a vertex-transitive graph,

(i) $E_{v}T_{x}\geq n\mbox{ if }(v,x)\not\in\mbox{{\cal E}}$

(ii) $\frac{2dn}{d+1}-d\geq E_{v}T_{x}\geq\frac{dn}{d+1}\mbox{ if }(v,x)\in\mbox{{% \cal E}}$

Proof. The lower bounds are specializations of Lemma 7.2(i), i.e. of Chapter 6 xxx. For the upper bound in (ii),

 $\displaystyle n-1$ $\displaystyle=$ $\displaystyle\frac{1}{d}\sum_{y\sim x}E_{y}T_{x}$ (7.21) $\displaystyle\geq$ $\displaystyle\frac{1}{d}\left(E_{v}T_{x}+(d-1)\frac{dn}{d+1}\right)\mbox{ by the lower bound in (ii)}.$

Rearrange.
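Lemma 7.13 can be verified exactly on the 3-cube ($d=3$, $n=8$), solving the hitting-time equations $E_{v}=1+\frac{1}{d}\sum_{w\sim v}E_{w}$ in rational arithmetic; a sketch:

```python
from fractions import Fraction as F
from itertools import product

d, n = 3, 8
V = list(product([0, 1], repeat=d))
nbrs = {v: [tuple(1 - x if k == i else x for k, x in enumerate(v))
            for i in range(d)] for v in V}

target = (0, 0, 0)
others = [v for v in V if v != target]
idx = {v: k for k, v in enumerate(others)}

# hitting-time equations: E_v = 1 + (1/d) sum_{w ~ v} E_w, E_target = 0
A = [[F(0)] * len(others) for _ in others]
rhs = [F(1)] * len(others)
for v in others:
    A[idx[v]][idx[v]] = F(1)
    for w in nbrs[v]:
        if w != target:
            A[idx[v]][idx[w]] -= F(1, d)

def solve(A, b):
    # exact Gauss-Jordan elimination over the rationals
    m = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(m):
        piv = next(r for r in range(col, m) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [M[r][m] for r in range(m)]

E = dict(zip(others, solve(A, rhs)))
neighbor_time = E[(1, 0, 0)]                         # hitting time across an edge
far_ok = all(E[v] >= n for v in others if sum(v) >= 2)   # part (i)
```

For the cube this gives `neighbor_time` $=7$, which indeed lies in the interval $[dn/(d+1),\,2dn/(d+1)-d]=[6,9]$ of part (ii).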

xxx mention general lower bound $\tau_{0}\geq(1-o(1))nd/(d-2)$ via tree-cover.

It is known (xxx ref) that a Cayley graph of degree $d$ is $d$-edge-connected, and so Chapter 6 Proposition yyy gives

 $\tau^{*}\leq n^{2}\psi(d)/d$

where $\psi(d)/d\approx\sqrt{2/d}$.

###### Example 7.14

A Cayley graph where $E_{v}T_{w}$ is not the same for all edges $(v,w)$.

Consider $Z_{m}\times Z_{2}$ with generators $(1,0),(-1,0),(0,1)$. The figure illustrates the case $m=4$.

Let’s calculate $E_{00}T_{01}$ using the resistance interpretation. Put unit voltage at $01$ and zero voltage at $00$, and let $a_{i}$ be the voltage at $i0$. By symmetry the voltage at $i1$ is $1-a_{i}$, so we get the equations

 $a_{i}=\frac{1}{3}(a_{i-1}+a_{i+1}+(1-a_{i})),\ 1\leq i\leq m-1$

with $a_{0}=a_{m}=0$. But this is just a linear difference equation, and a brief calculation gives the solution

 $a_{i}=\frac{1}{2}-\ \frac{1}{2}\frac{\theta^{m/2-i}+\theta^{i-m/2}}{\theta^{m/2}+\theta^{-m/2}}$

where $\theta=2-\sqrt{3}$. The current flow is $1+2a_{1}$, so the effective resistance is $r=(1+2a_{1})^{-1}$. The commute interpretation of resistance gives $2E_{00}T_{01}=3nr$, and so

 $E_{00}T_{01}=\frac{3n}{2(1+2a_{1})}$

where $n=2m$ is the number of vertices. In particular,

 $n^{-1}\ E_{00}T_{01}\rightarrow\gamma\equiv\frac{\sqrt{3}}{2}\mbox{ as }n\rightarrow\infty,$

since $a_{1}\rightarrow\frac{1}{2}(1-\theta)$ and hence $1+2a_{1}\rightarrow 2-\theta=\sqrt{3}$. Using the averaging property (7.21), $3=\gamma+2\gamma^{\prime}$, whence

 $n^{-1}\ E_{00}T_{10}\rightarrow\gamma^{\prime}\equiv\frac{6-\sqrt{3}}{4}\mbox{ as }n\rightarrow\infty.$
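The formula for $E_{00}T_{01}$ can be cross-checked by solving the hitting-time equations on $Z_{m}\times Z_{2}$ directly; for $m=4$ the graph is the $3$-cube, where the hitting time across any edge is the classical value $7$. A sketch:

```python
import math

def hitting_00_to_01(m):
    # solve E_v T_{(0,1)} on the Cayley graph of Z_m x Z_2,
    # generators (1,0), (-1,0), (0,1); equations E_v = 1 + (1/3) sum_{w~v} E_w
    V = [(i, j) for i in range(m) for j in range(2)]
    target = (0, 1)
    others = [v for v in V if v != target]
    idx = {v: k for k, v in enumerate(others)}
    N = len(others)
    A = [[0.0] * (N + 1) for _ in range(N)]
    for v in others:
        i, j = v
        A[idx[v]][idx[v]] = 1.0
        A[idx[v]][N] = 1.0
        for w in [((i - 1) % m, j), ((i + 1) % m, j), (i, 1 - j)]:
            if w != target:
                A[idx[v]][idx[w]] -= 1.0 / 3.0
    for c in range(N):                      # Gauss-Jordan with partial pivoting
        p = max(range(c, N), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [x / A[c][c] for x in A[c]]
        for r in range(N):
            if r != c:
                A[r] = [x - A[r][c] * y for x, y in zip(A[r], A[c])]
    return A[idx[(0, 0)]][N]

def formula(m):
    # E_{00} T_{01} = 3n / (2 (1 + 2 a_1)), n = 2m, with a_1 from the text
    theta = 2.0 - math.sqrt(3.0)
    a1 = 0.5 - 0.5 * ((theta ** (m / 2 - 1) + theta ** (1 - m / 2))
                      / (theta ** (m / 2) + theta ** (-m / 2)))
    return 3.0 * (2 * m) / (2.0 * (1.0 + 2.0 * a1))

dev = max(abs(hitting_00_to_01(m) - formula(m)) for m in (4, 7, 10))
cube_value = hitting_00_to_01(4)   # Z_4 x Z_2 is the 3-cube
```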

Turning from hitting times to mixing times, recall the Cheeger constant

 $\tau_{c}\equiv\sup_{A}c(A)$

where $A$ is a proper subset of vertices and

 $c(A)\equiv\frac{\pi(A^{c})}{P_{\pi}(X_{1}\in A^{c}|X_{0}\in A)}.$

For random walk on a Cayley graph one can use simple “averaging” ideas to bound $c(A)$. This is Proposition 7.15 below. The result in fact extends to vertex-transitive graphs by a covering graph argument - see xxx.

Consider a $n$-vertex Cayley graph with degree $d$ and generators $\mbox{{\cal G}}=\{g_{1},\ldots,g_{d}\}$, where $g\in\mbox{{\cal G}}$ implies $g^{-1}\in\mbox{{\cal G}}$. Then

 $P_{\pi}(X_{1}\in A^{c}|X_{0}\in A)=\frac{1}{d}\sum_{g\in\mbox{{\cal G}}}\frac{|Ag\setminus A|}{|A|}$

where $Ag=\{ag:a\in A\}$. Lower bounding the sum by its maximal term, we get

 $c(A)\leq\frac{d}{n}\ \frac{|A|\ |A^{c}|}{\max_{g\in\mbox{{\cal G}}}|Ag\setminus A|}.$ (7.22)
###### Proposition 7.15

On a Cayley graph of degree $d$

(i) $\tau_{c}\leq d\Delta$, where $\Delta$ is the diameter of the graph.

(ii) $c(A)\leq 2d\rho(A)$ for all $A$ with $\rho(A)\geq 1$, where

 $\rho(A)\equiv\min_{v\in\mbox{{\cal V}}}\max_{w\in A}d(v,w)$

is the radius of $A$.

Note that $\sup_{A}\rho(A)$ is bounded by $\Delta$ but not in general by $\Delta/2$ (consider the cycle), so that (ii) implies (i) with an extra factor of $2$. Part (i) is from Aldous [16] and (ii) is from Babai [36].

Proof. (i) Fix $A$. Because

 $\frac{1}{n}\sum_{v\in\mbox{{\cal V}}}|A\cap Av|=|A|^{2}/n$

there exists some $v\in\mbox{{\cal V}}$ such that $|A\cap Av|\leq|A|^{2}/n$, implying

 $|Av\setminus A|\geq|A||A^{c}|/n.$ (7.23)

We can write $v=g_{1}g_{2}\ldots g_{\delta}$ for some sequence of generators $(g_{i})$ and some $\delta\leq\Delta$, and

 $|Av\setminus A|\leq\sum_{i=1}^{\delta}|Ag_{i}g_{i+1}\ldots g_{\delta}\setminus Ag_{i+1}\ldots g_{\delta}|=\sum_{i=1}^{\delta}|(Ag_{i}\setminus A)g_{i+1}\ldots g_{\delta}|=\sum_{i=1}^{\delta}|Ag_{i}\setminus A|.$

So there exists $g\in\mbox{{\cal G}}$ with $|Ag\setminus A|\geq\frac{1}{\Delta}\times|A||A^{c}|/n$, and so (i) follows from (7.22). For part (ii), fix $A$ with $|A|\leq n/2$, write $\rho=\rho(A)$ and suppose

 $\max_{g\in\mbox{{\cal G}}}|Ag\setminus A|<\frac{1}{4\rho}|A|.$ (7.24)

By (7.24) and induction on the word length $d(x,\mbox{id})$ of $x$, using

 $A\setminus Axg\subseteq(A\setminus Ag)\cup(A\setminus Ax)g$

we have

 $|A\setminus Ax|<\frac{1}{4\rho}|A|\,d(x,\mbox{id}).$ (7.25)

Fix $v$ attaining $\max_{w\in A}d(w,v)=\rho$, and write $B^{r}\equiv\{g_{1}\ldots g_{i}:i\leq r,\ g_{i}\in\mbox{{\cal G}}\}$ for the ball of radius $r$ about the identity. Since $(2\rho+1)/(4\rho)<1$, inequality (7.25) shows that $A\cap Ax$ is non-empty for each $x\in B^{2\rho+1}$, and so $B^{2\rho+1}\subseteq A^{-1}A$. But by definition of $\rho$ we have $A\subseteq vB^{\rho}$, implying $A^{-1}A\subseteq B^{2\rho}$ and hence $B^{2\rho+1}\subseteq B^{2\rho}$, which in turn implies that $B^{2\rho}$ is the whole group. Now (7.25) implies that for every $x$

 $|Ax\setminus A|<\frac{1}{2}|A|\leq\frac{|A||A^{c}|}{n}.$

But this contradicts (7.23). So (7.24) is false, i.e.

 $\max_{g\in\mbox{{\cal G}}}|Ag\setminus A|\geq\frac{1}{4\rho}|A|\geq\frac{1}{2\rho}\ \frac{|A||A^{c}|}{n}.$

By complementation the final inequality remains true when $|A|>n/2$, and the result follows from (7.22).
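Proposition 7.15 can be checked exhaustively on a small Cayley graph. The sketch below takes the $6$-cycle (the Cayley graph of $Z_{6}$ with generators $\pm 1$, so $d=2$, $\Delta=3$) and computes $c(A)$ exactly over all proper subsets:

```python
from fractions import Fraction as F
from itertools import combinations

n, d = 6, 2
dist = lambda u, v: min((u - v) % n, (v - u) % n)
Delta = n // 2   # diameter of the n-cycle

def c(A):
    # c(A) = pi(A^c) / P_pi(X_1 in A^c | X_0 in A) = (|A^c|/n) * d|A| / e(A, A^c)
    Ac = [v for v in range(n) if v not in A]
    e = sum(1 for a in A for b in Ac if dist(a, b) == 1)   # boundary edges
    return F(len(Ac), n) * F(d * len(A), e)

def rho(A):
    # radius of A, as in part (ii)
    return min(max(dist(v, w) for w in A) for v in range(n))

subsets = [A for k in range(1, n) for A in combinations(range(n), k)]
tau_c = max(c(A) for A in subsets)
cheeger_ok = tau_c <= d * Delta                                    # part (i)
radius_ok = all(c(A) <= 2 * d * rho(A) for A in subsets if rho(A) >= 1)
```

Here the supremum is attained by an arc of $3$ consecutive vertices, giving $\tau_{c}=3/2$, comfortably below $d\Delta=6$.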

## 7.1.9 Comparison arguments for eigenvalues

The “distinguished paths” method of bounding relaxation times (Chapter 4 yyy) can also be used to compare relaxation times of two random flights on the same group, and hence to bound one “unknown” relaxation time in terms of a second “known” relaxation time. This approach has been developed in great depth in

xxx ref Diaconis Saloff-Coste papers.

Here we give only the simplest of their results, from [115].

Consider generators ${\cal G}$ of a group $I$, and consider a reversible random flight with step-distribution $\mu$ supported on ${\cal G}$. Write $d(x,\mbox{id})$ for the distance from $x$ to the identity in the Cayley graph, i.e. the minimal length of a word

 $x=g_{1}g_{2}\ldots g_{d};\ g_{i}\in\mbox{{\cal G}}.$

For each $x$ choose some minimal-length word as above and define $N(g,x)$ to be the number of occurrences of $g$ in the word. Now consider a different reversible random flight on $I$ with some step-distribution $\tilde{\mu}$, not necessarily supported on ${\cal G}$. If we know $\tilde{\tau}_{2}$, the next result allows us to bound $\tau_{2}$.

###### Theorem 7.16
 $\frac{\tau_{2}}{\tilde{\tau}_{2}}\leq K\equiv\max_{g\in\mbox{{\cal G}}}\frac{1}{\mu(g)}\sum_{x\in I}d(x,\mbox{id})N(g,x)\tilde{\mu}(x).$

xxx give proof – tie up with $L^{2}$ discussion

Perhaps surprisingly, Theorem 7.16 gives information even when the comparison walk is the “trivial” walk whose step-distribution $\tilde{\mu}$ is uniform on the group. In this case, both $d(x,\mbox{id})$ and $N(g,x)$ are bounded by the diameter $\Delta$, giving

###### Corollary 7.17

For reversible flight with step-distribution $\mu$ on a group $I$,

 $\tau_{2}\leq\frac{\Delta^{2}}{\min_{g\in\mbox{{\cal G}}}\mu(g)},$

where ${\cal G}$ is the support of $\mu$ and $\Delta$ is the diameter of the Cayley graph associated with ${\cal G}$.

When $\mu$ is uniform on ${\cal G}$ and $|\mbox{{\cal G}}|=d$, the Corollary gives the bound $d\Delta^{2}$, which improves on the bound $8d^{2}\Delta^{2}$ which follows from Proposition 7.15 and Cheeger’s inequality (Chapter 4 yyy). The examples of the torus $Z^{d}_{N}$ show that $\Delta^{2}$ enters naturally, but one could hope for the following variation.
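Corollary 7.17 can be illustrated on the $n$-cycle, where $\mu$ is uniform on the generators $\{+1,-1\}$, the second eigenvalue is $\cos(2\pi/n)$, and $\Delta=\lfloor n/2\rfloor$; a sketch (odd $n$, to keep the chain aperiodic):

```python
import math

def relaxation_time_cycle(n):
    # second eigenvalue of random walk on the n-cycle, via its known
    # eigenvector f(k) = cos(2*pi*k/n); verified by direct multiplication
    P = [[0.5 if (j - i) % n in (1, n - 1) else 0.0 for j in range(n)]
         for i in range(n)]
    f = [math.cos(2 * math.pi * k / n) for k in range(n)]
    Pf = [sum(P[i][j] * f[j] for j in range(n)) for i in range(n)]
    lam = Pf[1] / f[1]                 # equals cos(2*pi/n)
    assert all(abs(Pf[i] - lam * f[i]) < 1e-9 for i in range(n))
    return 1.0 / (1.0 - lam)

ok = True
for n in (5, 9, 15):
    tau2 = relaxation_time_cycle(n)
    Delta = n // 2
    mu_min = 0.5                       # mu uniform on {+1, -1}
    ok = ok and tau2 <= Delta ** 2 / mu_min   # Corollary 7.17 bound
```

The bound $\Delta^{2}/\mu_{\min}\approx n^{2}/2$ is off from the true $\tau_{2}\approx n^{2}/(2\pi^{2})$ only by a constant factor, consistent with the remark that $\Delta^{2}$ is the right order here.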

###### Open Problem 7.18

Write $\tau_{*}=\tau_{*}(I,\mbox{{\cal G}})$ for the minimum of $\tau_{2}$ over all symmetric random flights on $I$ with step-distribution supported on ${\cal G}$. Is it true that $\tau_{*}=O(\Delta^{2})$?