9.2 Markov chains and spanning trees

9.2.1 General Chains and Directed Weighted Graphs

Let’s jump into the details and defer the discussion until later. Consider a finite irreducible discrete-time Markov chain $(X_{n})$ with transition matrix ${\bf P}=(p_{vw})$, and note we are not assuming reversibility. We can identify ${\bf P}$ with a weighted directed graph, which has (for each $(v,w)$ with $p_{vw}>0$) a directed edge $(v,w)$ with weight $p_{vw}$. A directed spanning tree ${\bf t}$ is a spanning tree with one vertex distinguished as the root, and with each edge $e=(v,w)$ of ${\bf t}$ regarded as being directed towards the root. Write ${\cal T}$ for the set of directed spanning trees. For ${\bf t}\in{\cal T}$ define

 $\bar{\rho}({\bf t})\equiv\prod_{(v,w)\in{\bf t}}p_{vw}.$

Normalizing gives a probability distribution $\rho$ on ${\cal T}$:

 $\rho({\bf t})\equiv\frac{\bar{\rho}({\bf t})}{\sum_{{\bf t}^{\prime}\in{\cal T}}\bar{\rho}({\bf t}^{\prime})}.$

Now fix $n$ and consider the stationary Markov chain $(X_{m}:-\infty<m\leq n)$, run from time minus infinity to time $n$. We now use the chain to construct a random directed spanning tree ${\bf T}_{n}$. The root of ${\bf T}_{n}$ is $X_{n}$. For each $v\neq X_{n}$ there was a final time, $L_{v}$ say, before $n$ that the chain visited $v$:

 $L_{v}\equiv\max\{m\leq n:X_{m}=v\}.$

Define ${\bf T}_{n}$ to consist of the directed edges

 $(v=X_{L_{v}},X_{L_{v}+1}),\ v\neq X_{n}.$

So the edges of ${\bf T}_{n}$ are the last-exit edges from each vertex (other than the root $X_{n}$). It is easy to check that ${\bf T}_{n}$ is a directed spanning tree.
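To make the construction concrete, here is a minimal sketch (the function name is illustrative, not from the text) that extracts the last-exit tree from a finite trajectory segment, assuming the segment already visits every state:

```python
def last_exit_tree(path):
    """Given a trajectory path[0..n] that visits every state, return the
    directed spanning tree T_n as a dict mapping each vertex v != path[-1]
    to the head of its last-exit edge.  The root path[-1] is omitted."""
    root = path[-1]
    last_exit = {}
    # Sweeping forward and overwriting leaves, for each vertex v, the edge
    # (X_{L_v}, X_{L_v + 1}) taken at its final visit before time n.
    for m in range(len(path) - 1):
        last_exit[path[m]] = path[m + 1]
    last_exit.pop(root, None)  # the root has no outgoing tree edge
    return last_exit
```

For example, the trajectory $0,1,0,2,1,2,0$ yields the tree with edges $1\rightarrow 2$ and $2\rightarrow 0$, rooted at $0$.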

Now consider what happens as $n$ changes. Clearly the process $({\bf T}_{n}:-\infty<n<\infty)$ is a stationary Markov chain on ${\cal T}$, with a certain transition matrix ${\bf Q}=(q({\bf t},{\bf t}^{\prime}))$, say. The figure below indicates a typical transition ${\bf t}\rightarrow{\bf t}^{\prime}$. Here ${\bf t}$ was constructed by the chain finishing at its root $v$, and ${\bf t}^{\prime}$ is the new tree obtained when the chain makes a transition $v\rightarrow w$.

Theorem 9.10 (The Markov chain tree theorem)

The stationary distribution of $({\bf T}_{n})$ is $\rho$.

Proof. Fix a directed spanning tree ${\bf t}^{\prime}$. We have to verify

 $\sum_{{\bf t}}\bar{\rho}({\bf t})q({\bf t},{\bf t}^{\prime})=\bar{\rho}({\bf t}^{\prime}).$ (9.13)

Write $w$ for the root of ${\bf t}^{\prime}$. For each vertex $x\neq w$ there is a tree ${\bf t}_{x}$ constructed from ${\bf t}^{\prime}$ by adding an edge $(w,x)$ and then deleting from the resulting cycle the edge $(v,w)$ (say, for some $v=v(x)$) leading into $w$. For $x=w$ set $v(x)=x$. It is easy to see that the only possible transitions into ${\bf t}^{\prime}$ are from the trees ${\bf t}_{x}$, and that

 $\frac{\bar{\rho}({\bf t}_{x})}{\bar{\rho}({\bf t}^{\prime})}=\frac{p_{wx}}{p_{vw}};\ \ \ \ \ q({\bf t}_{x},{\bf t}^{\prime})=p_{vw}.$

Thus the left side of (9.13) becomes

 $\sum_{x}\bar{\rho}({\bf t}_{x})q({\bf t}_{x},{\bf t}^{\prime})=\bar{\rho}({\bf t}^{\prime})\sum_{x}p_{wx}=\bar{\rho}({\bf t}^{\prime}).$

$\Box$

The underlying chain $X_{n}$ can be recovered from the tree-valued chain ${\bf T}_{n}$ via $X_{n}=\mbox{root}({\bf T}_{n})$, so we can recover the stationary distribution of $X$ from the stationary distribution of ${\bf T}$, as follows.

Corollary 9.11 (The Markov chain tree formula)

For each vertex $v$ define

 $\bar{\pi}(v)\equiv\sum_{{\bf t}:\ v=\mbox{root}({\bf t})}\bar{\rho}({\bf t}).$
 $\pi(v)\equiv\frac{\bar{\pi}(v)}{\sum_{w}\bar{\pi}(w)}.$

Then $\pi$ is the stationary distribution of the original chain $(X_{n})$.
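For a small chain the formula can be checked by brute force. The sketch below (function name and example chain are my own) enumerates directed spanning trees by assigning each non-root vertex a unique outgoing edge and rejecting assignments containing a cycle:

```python
import itertools
import numpy as np

def tree_formula_stationary(P):
    """Markov chain tree formula (Corollary 9.11): pi(v) is proportional
    to the sum, over directed spanning trees rooted at v, of the product
    of p_{xw} over the tree's directed edges (x, w)."""
    n = len(P)
    pi_bar = np.zeros(n)
    for root in range(n):
        others = [v for v in range(n) if v != root]
        # heads assigns to each non-root vertex the head of its unique
        # outgoing edge; such an assignment is a tree iff it is acyclic.
        for heads in itertools.product(range(n), repeat=len(others)):
            assign = dict(zip(others, heads))

            def reaches_root(v):
                # Follow outgoing edges; revisiting a vertex means a cycle.
                seen = set()
                while v != root:
                    if v in seen:
                        return False
                    seen.add(v)
                    v = assign[v]
                return True

            if all(reaches_root(v) for v in others):
                pi_bar[root] += np.prod([P[x][w] for x, w in assign.items()])
    return pi_bar / pi_bar.sum()

# An arbitrary non-reversible 3-state chain for illustration:
P = np.array([[0.0, 0.7, 0.3],
              [0.2, 0.3, 0.5],
              [0.6, 0.4, 0.0]])
pi = tree_formula_stationary(P)
# pi should satisfy pi @ P == pi, up to floating point
```

The enumeration is exponential in the number of vertices, so this is only a sanity check, not a practical method.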

See the Notes for comments on this classical result.

Theorem 9.10 and the definition of ${\bf T}_{0}$ come close to specifying an algorithm for constructing a random spanning tree with distribution $\rho$. Of course the notion of running the chain from time $-\infty$ until time $0$ doesn’t sound very algorithmic, but we can rephrase this notion using time-reversal. Regarding the stationary distribution $\pi$ as known, the time-reversed chain $X^{*}$ has transition matrix $p^{*}_{vw}\equiv\pi_{w}p_{wv}/\pi_{v}$. Here is the restatement of Theorem 9.10 in terms of the time-reversed chain.

Corollary 9.12

Let $(X^{*}_{m}:0\leq m\leq C)$ be the time-reversed chain, run until the cover time $C$. Define ${\bf T}$ to be the directed spanning tree with root $X^{*}_{0}$ and with edges $(v=X^{*}_{T_{v}},X^{*}_{T_{v}-1}),\ v\neq X^{*}_{0}$, where $T_{v}$ is the first hitting time of $v$. If $X^{*}_{0}$ has distribution $\pi$ then ${\bf T}$ has distribution $\rho$. If $X^{*}_{0}$ is deterministically $v_{0}$, say, then ${\bf T}$ has distribution $\rho$ conditioned on being rooted at $v_{0}$.

Thus ${\bf T}$ consists of the edges by which each vertex is first visited, directed backwards.

For a reversible chain, we can of course use the chain itself in Corollary 9.12 above, in place of the time-reversed chain. If the chain is random walk on an unweighted graph $G$, then

 $\bar{\rho}({\bf t})=d(\mbox{root}({\bf t}))\ \prod_{v}\frac{1}{d(v)}$

where $d(v)$ is the degree of $v$ in $G$. So $\bar{\rho}$, restricted to the set of spanning trees with specified root $v_{0}$, is uniform on that set. In this setting, Corollary 9.12 specializes as follows.

Corollary 9.13

Let $(X_{m}:0\leq m\leq C)$ be random walk on an unweighted graph $G$, started at $v_{0}$ and run until the cover time $C$. Define ${\bf T}$ to be the directed spanning tree with root $v_{0}$ and with edges $(v=X_{T_{v}},X_{T_{v}-1}),\ v\neq v_{0}$. Then ${\bf T}$ is uniform on the set of all directed spanning trees of $G$ rooted at $v_{0}$.

We can rephrase this. If we just want “plain” spanning trees without a root and directions, then the ${\bf T}$ above, regarded as a plain spanning tree, is uniform on the set of all plain spanning trees. On the other hand, if we want a rooted spanning tree which is uniform on all such trees without prespecified root, the simplest procedure is to construct ${\bf T}$ as in Corollary 9.13 with deterministic start $v_{0}$, and at the end re-root ${\bf T}$ at a uniform random vertex. (This is slightly subtle – we could alternatively start with $X_{0}$ uniform, which is typically not the stationary distribution $\pi$.)

Using the bounds on cover time developed in Chapter 6, we now have an algorithm for generating a uniform spanning tree of an $n$-vertex graph in $O(n^{3})$ steps (and $O(n^{2})$ steps on a regular graph). No other known algorithm achieves these bounds.
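Corollary 9.13 translates directly into the following sketch (function and variable names are illustrative): walk until every vertex has been visited, and record, for each vertex other than the start, the reversed first-entrance edge.

```python
import random

def uniform_spanning_tree(adj, v0, rng=random):
    """First-entrance construction of Corollary 9.13: run random walk on
    the unweighted graph with adjacency lists `adj`, started at v0, until
    the cover time.  Each vertex v != v0 contributes the edge by which it
    was first entered, directed backwards.  Returns a dict mapping each
    vertex to its parent on the path toward the root v0."""
    parent = {}
    current = v0
    unvisited = set(adj) - {v0}
    while unvisited:
        nxt = rng.choice(adj[current])       # one random-walk step
        if nxt in unvisited:
            parent[nxt] = current            # first-entrance edge, reversed
            unvisited.discard(nxt)
        current = nxt
    return parent

# Example: the 4-cycle, as adjacency lists of an unweighted graph
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
tree = uniform_spanning_tree(adj, v0=0)
```

By the corollary, `tree` is uniform over the directed spanning trees of the 4-cycle rooted at $0$; re-rooting at a uniform random vertex, as described below, gives a uniform rooted tree.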

9.2.2 Electrical network theory

The ideas in this subsection (and much more) are treated in a long but very readable survey paper by Pemantle [279], which we encourage the interested reader to consult. As observed above, in the reversible setting we have the obvious simplification that we can construct uniform spanning trees using the chain itself. Deeper results can be found using the electrical network analogy. Consider random walk on a weighted graph $G$. The random spanning tree ${\bf T}$ constructed by Corollary 9.12, interpreted as a “plain” spanning tree, has distribution

 $\rho({\bf t})=c\ \prod_{e\in{\bf t}}w_{e}$

where $c$ is the normalizing constant. If an edge $e$ is essential, it must be in every spanning tree, so $P(e\in{\bf T})=1$. If the edge is inessential, the probability will be strictly between $0$ and $1$. Intuitively, $P(e\in{\bf T})$ should provide a measure of “how nearly essential $e$ is”. This should remind the reader of the inessential edge inequality (yyy). Interpret the weighted graph as an electrical network in which an edge $e=(v,x)$ has resistance $1/w_{e}$, and write $r_{vx}$ for the effective resistance between $v$ and $x$.

Proposition 9.14

For each edge $(v,x)$,

 $P((v,x)\in{\bf T})=w_{vx}r_{vx}.$

Note that in an $n$-vertex graph, ${\bf T}$ has exactly $n-1$ edges, so Proposition 9.14 implies Foster’s theorem (Chapter 3 yyy)

 $\sum_{\mbox{edges }(v,x)}w_{vx}r_{vx}=n-1.$

Proof. Consider the random walk started at $v$ and run until the time $U$ of the first return to $v$ after the first visit to $x$. Let $p$ be the chance that $X_{U-1}=x$, i.e. that the return to $v$ is along the edge $(x,v)$. We can calculate $p$ in two ways. In terms of random walk started at $x$, $p$ is the chance that the first visit to $v$ is from $x$, and so by Corollary 9.12 (applied to the walk started at $x$) $p=P((x,v)\in{\bf T})$. On the other hand, consider the walk started at $v$ and let $S$ be the first time that the walk traverses $(x,v)$ in that direction. Successive $U$-cycles are independent, and each ends with a traversal of $(x,v)$ with probability $p$, so by Wald’s identity

 $ES=EU/p.$

But by yyy and yyy

 $ES=w/w_{vx},\ EU=wr_{vx}$

and hence $p=w_{vx}r_{vx}$ as required. $\Box$
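The Foster’s theorem identity above is easy to check numerically: effective resistances can be read off from the Moore-Penrose pseudoinverse of the weighted graph Laplacian (a standard fact; the example graph and function names below are my own).

```python
import numpy as np

def effective_resistances(n, edges):
    """edges: list of (v, x, w) with weight w.  Returns {(v, x): r_vx},
    computed as r_vx = L+[v,v] + L+[x,x] - 2 L+[v,x] where L+ is the
    pseudoinverse of the weighted Laplacian."""
    L = np.zeros((n, n))
    for v, x, w in edges:
        L[v, v] += w
        L[x, x] += w
        L[v, x] -= w
        L[x, v] -= w
    Lp = np.linalg.pinv(L)
    return {(v, x): Lp[v, v] + Lp[x, x] - 2 * Lp[v, x]
            for v, x, _ in edges}

# An arbitrary connected weighted graph on 4 vertices:
edges = [(0, 1, 2.0), (1, 2, 1.0), (2, 3, 3.0), (3, 0, 1.0), (0, 2, 0.5)]
r = effective_resistances(4, edges)
total = sum(w * r[(v, x)] for v, x, w in edges)
# Foster's theorem: total == n - 1 == 3
```

Each term $w_{vx}r_{vx}$ is, by Proposition 9.14, the probability that the edge appears in the random spanning tree, so the identity simply says the tree has $n-1$ edges.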

The next result indicates the usefulness of the electrical network analogy.

Proposition 9.15

For any two edges $e_{1}\neq e_{2}$,

 $P(e_{1}\in{\bf T},e_{2}\in{\bf T})\leq P(e_{1}\in{\bf T})P(e_{2}\in{\bf T}).$

Proof. Consider the “shorted” graph $G^{{\rm short}}$ in which the end-vertices $(x_{1},x_{2})$ of $e_{1}$ are shorted into a single vertex $x$, with edge-weights $w_{xv}=w_{x_{1}v}+w_{x_{2}v}$. The natural $1-1$ correspondence ${\bf t}\leftrightarrow{\bf t}\cup\{e_{1}\}$ between spanning trees of $G^{{\rm short}}$ and spanning trees of $G$ containing $e_{1}$ maps the distribution $\rho^{{\rm short}}$ to the conditional distribution $\rho(\cdot|e_{1}\in{\bf T})$. So, writing ${\bf T}^{{\rm short}}$ for the random spanning tree associated with $G^{{\rm short}}$,

 $P(e_{2}\in{\bf T}^{{\rm short}})=P(e_{2}\in{\bf T}|e_{1}\in{\bf T}).$

But, setting $e_{2}=(z_{1},z_{2})$, Proposition 9.14 shows

 $P(e_{2}\in{\bf T}^{{\rm short}})=w_{z_{1}z_{2}}r^{{\rm short}}_{z_{1}z_{2}},\ \ \ P(e_{2}\in{\bf T})=w_{z_{1}z_{2}}r_{z_{1}z_{2}}.$

By Rayleigh’s monotonicity principle, $r^{{\rm short}}_{z_{1}z_{2}}\leq r_{z_{1}z_{2}}$, and the result follows. $\Box$
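On a small graph the negative-correlation inequality can be verified by enumerating spanning trees directly. The brute-force sketch below (names and example graph are my own) builds the distribution $\rho({\bf t})=c\prod_{e\in{\bf t}}w_{e}$ over plain spanning trees and checks all edge pairs:

```python
import itertools

def spanning_tree_probs(n, edges):
    """Enumerate spanning trees of a weighted graph on n vertices.
    edges: list of (v, x, w).  Returns the distribution rho as a dict
    {frozenset of edge indices: probability}."""
    trees = {}
    for subset in itertools.combinations(range(len(edges)), n - 1):
        # Union-find: n-1 edges form a spanning tree iff they are acyclic.
        parent = list(range(n))

        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a

        acyclic = True
        for i in subset:
            v, x, _ = edges[i]
            rv, rx = find(v), find(x)
            if rv == rx:
                acyclic = False
                break
            parent[rv] = rx
        if acyclic:
            weight = 1.0
            for i in subset:
                weight *= edges[i][2]
            trees[frozenset(subset)] = weight
    total = sum(trees.values())
    return {t: w / total for t, w in trees.items()}

# An arbitrary weighted graph on 4 vertices:
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 0, 1.0), (2, 3, 1.0), (3, 0, 0.5)]
rho = spanning_tree_probs(4, edges)
p = lambda i: sum(pr for t, pr in rho.items() if i in t)
# Proposition 9.15: P(e1 in T, e2 in T) <= P(e1 in T) P(e2 in T)
```

The same enumeration also recovers $P(e\in{\bf T})$ directly, so it can be used to spot-check Proposition 9.14 as well.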