For a finite set , there is a close connection between
(a) having an explicit formula for the size
(b) having a bounded-time algorithm for generating a uniform random
element of .
As an elementary illustration,
we all know that there are permutations of objects.
From a proof of this fact, we could write down an explicit
mapping between the set of permutations and the
set .
Then we could simulate a uniform random permutation by first simulating
a uniform random element of
and then computing .
Conversely, given an algorithm which was guaranteed to produce
a uniform random permutation after calls to a random number
generator, we could (in principle) analyze the working of the
algorithm in order to calculate the chance of getting the
identity permutation.
Then we can say that number of permutations equals .
A more subtle observation is that, in certain settings, having an algorithm for generating an approximately uniform random element of can be used to estimate approximately the size . The idea is to estimate successive ratios by sampling. Suppose we can relate to smaller sets
(10.20) |
where is known, the ratios are bounded away from , and where we can sample uniformly from each . Then take uniform random samples from each and find the sample proportion which fall into . Because , we use
as an estimate of . To study its accuracy, it is simpler to consider . Clearly , and we can calculate the variance by
The simplest case is where we know a theoretical lower bound for the . Then by taking we get
In other words, with a total number
(10.21) |
of random samples, we can statistically estimate to within a factor .
The conceptual point of invoking intermediate sets is that the overall ratio may be exponentially small in some size parameter, so that trying to estimate this ratio directly by sampling from would involve order , i.e. exponentially many, samples. If we can specify the intermediate sets with ratios bounded away from and then and so the number of samples required in (10.21) depends on instead of .
The discussion so far has not involved Markov chains. From our viewpoint, the interesting setting is where we cannot directly get uniform random samples from a typical , but instead need to use Markov chain Monte Carlo. That is, on each we set up a reversible Markov chain with uniform stationary distribution (i.e. a chain whose transition matrix is symmetric in the sense ) Assume we have a bound on the -values of all these chains. Then as a small modification of Corollary 10.3, one can get samples from the combined chains whose joint distribution is close (in variation distance) to the the distribution of independent samples from the uniform distributions in steps. As above, if we can specify the intermediate sets with ratios bounded away from and then we need samples, and so (ignoring dependence on ) the total number of steps in all the chains is .
In summary, to implement this method of approximate counting via Markov chains, one needs
a way to specify the intermediate sets (10.20)
a way to specify Markov chains on the whose mixing times can be rigorously bounded.
Two particular examples have been studied in detail, and historically these examples provided major impetus for the development of technical tools to estimate mixing times. Though the details are too technical for this book, we outline these examples in the next two sections, and then consider in detail the setting of self-avoiding walks.
Given a closed convex set in , for large , how can we algorithmically calculate the volume of ? Regard as being described by an oracle, that is for any we can determine in one step whether or not . Perhaps surprisingly, there is no known deterministic algorithm which finds approximately (i.e. to within a factor ) in a polynomial in number of steps. But this problem is amenable to “approximate counting via Markov chains” technique. This line of research was initiated by Dyer et al [134, 135] who produced an algorithm requiring steps. A long sequence of papers (see the Notes) studied variants of both the Markov chains and the analytic techniques in order to reduce the polynomial degree. Currently the best bound is , due to Kannan et al [206].
To outline the procedure in this example, suppose we know , where is the ball of radius . (It turns out that one can transform any convex set into one satisfying these constraints with .) We specify an increasing sequence of convex subsets
by setting . This makes the ratios of successive volumes bounded by and requires intermediate sets. So the issue is to design and analyze a chain on a typical convex set whose stationary distribution is uniform. Various Markov chains have been used: simple random walk on a fine discrete lattice restricted to , or spherically symmetric walks. The analysis of the chains has used Cheeger inequalities for chains and the refinement of classical isoperimetric inequalities for convex sets. Recent work of Bubley et al [78] has successfully introduced coupling methods, and it is a challenging problem to refine these coupling methods. There is a suggestive analogy with theoretical study of Brownian motion in a convex set – see Chapter 13 section 1.3 (yyy 7/29/99 version).
For a finite, not necessarily regular, graph let be the set of all matchings in , where a matching is a subset of edges such that no vertex is in more than one edge of . Suppose we want to count (for the harder setting of counting perfect matchings see the Notes). Enumerate the edges of as , where is the number of edges of . Write for the graph with edges deleted. A matching of can be identified with a matching of which does not contain , so we can write
Since has one edge, we know . The ratio is the probability that a uniform random matching of does not contain the edge . So the issue is to design and analyze a chain on a typical set of matchings whose stationary distribution is uniform. Here is a natural such chain. From a matching , pick a uniform random edge of , and construct a new matching from and as follows.
If then set .
If neither end-vertex of is in an edge of then set .
If exactly one end-vertex of is in an edge ( say) of then set .
This construction (the idea goes back to Broder [69]) yields a chain with symmetric transition matrix, because each possible transition has chance . An elegant analysis by Jerrum and Sinclair [197], outlined in Jerrum [199] section 5.1, used the distinguished paths technique to prove that on a -vertex -edge graph
Since the number of matchings can be bounded crudely by ,
(10.22) |
xxx to be written