# 9.7 Notes on Chapter 9

Section 9.1. The idea of a maximal coupling goes back to Goldstein [169]; see Lindvall [234] for further history. Strong stationary times were studied in detail by Diaconis - Fill [113, 112] and Fill [150, 151], with particular attention to the case of one-dimensional stochastically monotone chains, where there is some interesting “duality” theory. The special case of random walks on groups had previously been studied in Aldous - Diaconis [7, 8], and the idea is implicit in the regenerative approach to time-asymptotics for general state space chains, discussed at xxx. The theory surrounding Theorem 9.4 goes back to Rost [301]. This is normally regarded as part of the potential theory of Markov chains, which emphasizes analogous results in the transient setting; the recurrent case is rather a sideline there. See Revuz [289] sec. 2.5 or Dellacherie - Meyer [108] Chapter 9 sec. 3 for textbook treatments in the general-space setting. The observation that the theory applies to simple finite examples such as those in section 9.1.3 was made by Lovasz - Winkler [240], from whom we borrowed the phrase halting state. Monotonicity properties like that in the statement of Corollary 9.5 were studied in detail by Brown [73] from the viewpoint of approximate exponentiality of hitting times.

Section 9.2. A slightly more sophisticated and extensive textbook treatment of these topics is in Lyons [249]. The nomenclature reflects my taste: Theorem 9.10 is “the underlying theorem” which implies “the formula” for the stationary distribution in terms of weighted spanning trees. Different textbooks (e.g. [178] p. 340 xxx more refs) give rather different historical citations for the Markov chain tree formula, and in talks I often call it “the most often rediscovered result in probability theory”: it would be an interesting project to track down the earliest explicit statement. Of course it can be viewed as part of a circle of ideas (including the matrix-tree theorem for the number of spanning trees in a graph) which is often traced back to Kirchhoff. The fact that Theorem 9.10 underlies the formula was undoubtedly folklore for many years (Diaconis attributes it to Peter Doyle, and indeed it appears in an undergraduate thesis [317] of one of his students), but was apparently not published until the paper of Anantharam and Tsoucas [30]. The fact that the Markov chain tree theorem can be interpreted as an algorithm for generating uniform random spanning trees was observed by Aldous [19] and Broder [70], both ideas deriving from conversations with Diaconis. The paper [19] initiated the study of theoretical properties of uniform random spanning trees, proving e.g. the following bounds on the diameter $\Delta$ of the random tree in a regular $n$-vertex graph.

 $\frac{n^{1/2}}{K_{1}\tau_{2}\log n}\leq E\Delta\leq K_{2}\tau_{2}^{1/2}n^{1/2}\log n$ (9.24)

where $K_{1}$ and $K_{2}$ are absolute constants. Loosely, “in an expander, a random spanning tree has diameter $n^{1/2\pm o(1)}$”. Results on the asymptotic Poisson distribution of the degrees in a random spanning tree are given in Aldous [19], Pemantle [279] and Burton and Pemantle [83]. Pemantle [278] discusses the analog of uniform random spanning trees on the infinite $d$-dimensional lattice, and Aldous and Larget [9] give simulation results on quantitative behavior on the $d$-dimensional torus.
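The random-walk algorithm of [19] and [70] is short enough to sketch. A minimal version (the function name and example graph are illustrative, not from the text): run simple random walk from an arbitrary root and, for each vertex other than the root, keep the edge by which the walk first enters it; the resulting edge set is a uniform random spanning tree.

```python
import random

def aldous_broder(adj, root, rng):
    """Sample a uniform random spanning tree of a connected undirected
    graph: walk at random and record, for each vertex other than the
    root, its first-entrance edge."""
    visited = {root}
    tree = []                      # edges (parent, child) of the tree
    v = root
    while len(visited) < len(adj):
        w = rng.choice(adj[v])     # one step of simple random walk
        if w not in visited:       # first visit to w: keep this edge
            visited.add(w)
            tree.append((v, w))
        v = w
    return tree

# Example: a random spanning tree of the 4-cycle 0-1-2-3-0.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
tree = aldous_broder(adj, root=0, rng=random.Random(1))
```

Since every non-root vertex contributes exactly one first-entrance edge, the output always has $n-1$ edges and spans the graph.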

Section 9.2.2. As described in Pemantle [279] and Burton and Pemantle [83], the key to deeper study of random spanning trees is

###### Theorem 9.28 (Transfer-impedance theorem)

Fix a graph $G$. There is a symmetric function $H(e_{1},e_{2})$ on pairs of edges in $G$ such that for any distinct edges $e_{1},\ldots,e_{r}$

 $P(e_{i}\in{\bf T}\mbox{ for all }1\leq i\leq r)={\rm det}\ M(e_{1},\ldots,e_{r})$

where $M(e_{1},\ldots,e_{r})$ is the matrix with entries $H(e_{i},e_{j}),1\leq i,j\leq r$.
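Concretely, $H$ can be taken to be the transfer-current matrix $H(e,f)=\chi_{e}^{T}L^{+}\chi_{f}$, where $L^{+}$ is the pseudoinverse of the graph Laplacian and $\chi_{e}$ is the signed incidence vector of the oriented edge $e$; this construction is standard but not spelled out in this section. A numerical check on the triangle, whose three spanning trees each contain 2 of the 3 edges:

```python
import numpy as np

# Triangle graph K_3: vertices 0,1,2, oriented edges e = (tail, head).
n = 3
edges = [(0, 1), (1, 2), (2, 0)]
L = 3 * np.eye(n) - np.ones((n, n))   # Laplacian of K_3
Lplus = np.linalg.pinv(L)

def chi(e):
    """Signed incidence vector of the oriented edge e."""
    v = np.zeros(n)
    v[e[0]], v[e[1]] = 1.0, -1.0
    return v

# Transfer-impedance matrix: H(e_i, e_j) = chi_i^T L^+ chi_j.
H = np.array([[chi(e) @ Lplus @ chi(f) for f in edges] for e in edges])

p_one = H[0, 0]                    # P(e_1 in T) = 2/3
p_two = np.linalg.det(H[:2, :2])   # P(e_1, e_2 in T) = 1/3
p_all = np.linalg.det(H)           # P(all three in T) = 0 (they form a cycle)
```

The determinant of the full matrix vanishes because the three edges contain a cycle, so no spanning tree includes them all.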

Section 9.3. The first “pure simulation” algorithm for sampling exactly from the stationary distribution was given by Asmussen et al. [34], using a quite different idea and lacking explicit time bounds.

Section 9.3.1. In our discussion of these algorithms, we are assuming that we have a list of all states. Lovasz - Winkler [241] gave the argument in a slightly different setting, where the algorithm can only “address” a single state, and their bound involved $\max_{ij}E_{i}T_{j}$ in place of $\tau_{1}^{*}$.

Section 9.3.3. Letac [229] gives a survey of the “backwards coupling” method for establishing convergence of continuous-space chains: it suffices to show there exists a r.v. $X^{-\infty}$ such that $X^{(x,s)}_{0}\rightarrow X^{-\infty}$ a.s. as $s\rightarrow-\infty$, for each state $x$. This method is especially useful in treating matrix-valued chains of the form $X_{t}=A_{t}X_{t-1}+B_{t}$, where $(A_{t},B_{t}),t\geq 1$ are i.i.d. random matrices. See Barnsley and Elton [43] for a popular application.
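A scalar sketch of the contraction behind backwards coupling for the affine recursion (the particular $(A_{t},B_{t})$ distribution below is an arbitrary illustrative choice with $E\log|A_{t}|<0$): applying the same random maps to two different starting states, the iterates $X^{(x,s)}_{0}$ coalesce, which is what identifies the limit $X^{-\infty}$.

```python
import random

def backward_iterate(x0, s, rng):
    """Compute X_0^{(x0, -s)}: start the affine chain
    X_t = A_t X_{t-1} + B_t at state x0 at time -s and run it to time 0."""
    x = x0
    for _ in range(s):
        a = 0.9 * rng.random()     # A_t uniform on (0, 0.9): a contraction
        b = rng.gauss(0.0, 1.0)    # B_t standard normal
        x = a * x + b
    return x

# Same seed = same random maps: two very different starting states
# are driven to (numerically) the same value by time 0.
x_from_0 = backward_iterate(0.0, 500, random.Random(7))
x_from_100 = backward_iterate(100.0, 500, random.Random(7))
```

The gap between the two runs is at most $100\prod_{t}A_{t}\leq 100\cdot 0.9^{500}$, which is negligible.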

Section 9.4.1. One result on spectra and reversibilizations is the following. For a transition matrix ${\bf P}$ write

 $\tau({\bf P})=\sup\{\frac{1}{1-|\lambda|}:\ \lambda\neq 1\mbox{ an eigenvalue of }{\bf P}\}.$

Then for the additive reversibilization ${\bf Q}^{(1)}=\frac{1}{2}({\bf P}+{\bf P}^{*})$ we have (e.g. [316] Proposition 1)

 $\tau({\bf P})\leq 2\tau({\bf Q}^{(1)}).$
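For a concrete check (the example chain is ours, not from [316]): the non-reversible 3-state chain which holds with probability 1/2 and otherwise steps cyclically has $\tau({\bf P})=2$ and $\tau({\bf Q}^{(1)})=4/3$, consistent with the bound.

```python
import numpy as np

def tau(P, tol=1e-9):
    """tau(P) = sup of 1/(1 - |lambda|) over eigenvalues lambda != 1."""
    eig = [l for l in np.linalg.eigvals(P) if abs(l - 1.0) > tol]
    return max(1.0 / (1.0 - abs(l)) for l in eig)

# Hold w.p. 1/2, else step i -> i+1 mod 3 (non-reversible).
C = np.roll(np.eye(3), 1, axis=1)   # cyclic permutation matrix
P = 0.5 * np.eye(3) + 0.5 * C
# P is doubly stochastic, so pi is uniform and the time-reversal P* is
# just the transpose; the additive reversibilization is Q = (P + P^T)/2.
Q = 0.5 * (P + P.T)

t_P, t_Q = tau(P), tau(Q)           # 2.0 and 4/3; t_P <= 2 * t_Q holds
```

Here the eigenvalues of ${\bf P}$ other than 1 are $\frac{1}{4}\pm\frac{\sqrt{3}}{4}i$ (modulus $\frac{1}{2}$), while those of ${\bf Q}^{(1)}$ are $\frac{1}{4},\frac{1}{4}$.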