Chapter 4 Theorem yyy involved three mixing time parameters: one related to variation distance to stationarity, one related to “separation” from stationarity, and one related to stationary times (see below). In Chapter 4 these parameters were defined under worst-case initial distributions, and our focus was on “equivalence” of these parameters for reversible chains. Here we discuss the underlying “exact” results. Fix an initial distribution $\mu$. Then associated with each notion of mixing there is a corresponding construction of a minimal random time $T$, stated in Theorems 9.1 - 9.3 below.
xxx randomized stopping times
Call a stopping time $T$ a strong stationary time if

(9.1) $P_\mu(X_T = j,\ T = t) = \pi_j\, P_\mu(T = t)$ for all $j, t$,

i.e. if $X_T$ has distribution $\pi$ and is independent of $T$. Call a stopping time $T$ a stationary time if

(9.2) $P_\mu(X_T = j) = \pi_j$ for all $j$.

Call a random time $T$ a coupling time if we can construct a joint distribution $((X_t), (Y_t))$ such that $(X_t)$ is the chain with initial distribution $\mu$, $(Y_t)$ is the stationary chain, and $X_t = Y_t$ for all $t \ge T$. (A coupling time need not be a stopping time, even w.r.t. the joint process; this is almost the only instance of a random time which is not a stopping time that we encounter in this book.)
Recall from yyy the notion of separation of a distribution $\theta$ from $\pi$:

$\mathrm{sep}(\theta, \pi) = \max_j\, (1 - \theta_j/\pi_j)$.

Write $s_\mu(t)$ for the separation at time $t$ when the initial distribution was $\mu$:

$s_\mu(t) = \max_j\, \left(1 - P_\mu(X_t = j)/\pi_j\right)$.

Similarly write $d_\mu(t)$ for the variation distance from stationarity at time $t$:

$d_\mu(t) = \| P_\mu(X_t \in \cdot) - \pi(\cdot) \|$.
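These quantities are easy to compute numerically by iterating the transition matrix. The sketch below does so for a generic two-state chain; the chain and its parameters are arbitrary illustrative choices, not taken from the text.

```python
# Compute s_mu(t) and d_mu(t) for the 2-state chain with
# p(0,1) = alpha and p(1,0) = beta, started at state 0.

P = [[0.8, 0.2],   # transition matrix p(i,j)
     [0.3, 0.7]]
alpha, beta = P[0][1], P[1][0]
pi = [beta / (alpha + beta), alpha / (alpha + beta)]  # stationary distribution

def step(mu, P):
    """One step of the chain: row vector mu times transition matrix P."""
    n = len(mu)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def separation(mu, pi):
    return max(1.0 - mu[j] / pi[j] for j in range(len(mu)))

def variation(mu, pi):
    return 0.5 * sum(abs(mu[j] - pi[j]) for j in range(len(mu)))

mu = [1.0, 0.0]            # initial distribution: point mass at state 0
s, d = [], []
for t in range(20):
    s.append(separation(mu, pi))
    d.append(variation(mu, pi))
    mu = step(mu, P)
```

For this two-state chain one can check by hand that $s_\mu(t) = (1 - \alpha - \beta)^t$ exactly, and that $d_\mu(t) \le s_\mu(t)$ at every $t$, as the general theory predicts.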
Theorem 9.1 Let $T$ be any strong stationary time for the $\mu$-chain. Then

(9.3) $s_\mu(t) \le P_\mu(T > t)$ for all $t \ge 0$.

Moreover there exists a minimal strong stationary time $T$ for which

(9.4) $s_\mu(t) = P_\mu(T > t)$ for all $t \ge 0$.
Theorem 9.2 For any coupling time $T$,

$d_\mu(t) \le P(T > t)$ for all $t \ge 0$.

Moreover there exists a minimal coupling time $T$ for which

$d_\mu(t) = P(T > t)$ for all $t \ge 0$.
Theorem 9.3 For any stationary time $T$,

(9.5) $E_\mu T \ge \max_k\, (E_\mu T_k - E_\pi T_k)$.

Moreover there exist mean-minimal stationary times $T$ for which

(9.6) $E_\mu T = \max_k\, (E_\mu T_k - E_\pi T_k)$.
In each case, the first assertion is immediate from the definitions, and the issue is to carry out a construction of the required $T$. Despite the similar appearance of the results, attempts to place them all in a common framework have not been fruitful. We will prove Theorems 9.1 and 9.3 below, and illustrate with examples. These two proofs involve only rather simple “greedy” constructions. We won’t give the proof of Theorem 9.2 (the construction is usually called the maximal coupling: see Lindvall [234]) because the construction is a little more elaborate and the existence of the minimal coupling time is seldom useful; on the other hand, the coupling inequality in Theorem 9.2 will be used extensively in Chapter 14. In the context of Theorems 9.1 and 9.2 the minimal times are clearly unique in distribution, but in Theorem 9.3 there will generically be many mean-minimal stationary times with different distributions.
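To make the right side of (9.5) concrete, the sketch below evaluates $\max_k (E_\mu T_k - E_\pi T_k)$ for a two-state chain, where all hitting times are geometric and so elementary; the chain and parameters are illustrative assumptions, not from the text.

```python
# Evaluate max_k (E_mu T_k - E_pi T_k) for the 2-state chain with
# p(0,1) = alpha, p(1,0) = beta, started at state 0.

alpha, beta = 0.2, 0.3
pi = [beta / (alpha + beta), alpha / (alpha + beta)]

E01 = 1.0 / alpha          # E_0 T_1: geometric waiting time for a 0 -> 1 jump
E10 = 1.0 / beta           # E_1 T_0
# E_pi T_k = sum_i pi_i E_i T_k, with E_k T_k = 0
Epi = [pi[1] * E10, pi[0] * E01]
Emu = [0.0, E01]           # chain started at state 0, so E_mu T_k = E_0 T_k

lower_bound = max(Emu[k] - Epi[k] for k in range(2))
```

For this chain the bound works out to $1/(\alpha + \beta)$, which happens to be the relaxation time of the two-state chain.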
For comparison with the other two results, we stated Theorem 9.3 in terms of stopping times at which the stationary distribution is attained, but the underlying result (amplified as Theorem 9.4) holds for an arbitrary target distribution $\rho$. So fix $\rho$ as well as the initial distribution $\mu$. Call a stopping time $T$ admissible if $P_\mu(X_T \in \cdot) = \rho(\cdot)$. Write $\bar{t}(\mu, \rho)$ for the inf of $E_\mu T$ over all admissible stopping times $T$.
Theorem 9.4
(a) $\bar{t}(\mu, \rho) = \max_k\, (E_\mu T_k - E_\rho T_k)$.
(b) The “filling scheme” below defines an admissible stopping time $T$ such that $E_\mu T = \bar{t}(\mu, \rho)$.
(c) Any admissible stopping time $T$ with the property

(9.10) $P_\mu(X_t = k,\ T > t) = 0$ for all $t$, for some state $k$

satisfies $E_\mu T = \bar{t}(\mu, \rho)$.
Part (c) is rather remarkable, and can be rephrased as follows. Call a state $k$ with property (9.10) a halting state for the stopping time $T$. In words, the chain must stop if and when it hits a halting state. Then part (c) asserts that, to verify that an admissible time $T$ attains the minimum $\bar{t}(\mu, \rho)$, it suffices to show that there exists some halting state. In the next section we shall see that this is very useful in simple examples.
Proof. The greedy construction used here is called a filling scheme. Recall from (9.7) the definitions

$\theta_t(j) = P_\mu(X_t = j,\ T \ge t), \qquad \sigma_t(j) = P_\mu(X_T = j,\ T \le t)$.

Write also $\sigma_{-1}(j) = 0$. We now define $(\theta_t, \sigma_t)$ and the associated stopping time $T$ inductively via (9.8) and

$P_\mu(X_t = j,\ T = t) = \min\left(\theta_t(j),\ \rho_j - \sigma_{t-1}(j)\right)$.

In words, we stop at the current state ($j$, say) provided our “quota” $\rho_j$ for the chance of stopping at $j$ has not yet been filled. Clearly

(9.11) $\sigma_t(j) \le \rho_j$ for all $t$ and $j$.

We now claim that $T$ satisfies property (9.10). To see this, consider

$u_j = \min\{t:\ \sigma_t(j) = \rho_j\} \le \infty$.

Then (9.10) holds by construction for any $k$ such that $u_k = \max_j u_j$: before time $u_k$ every arrival at $k$ is stopped (else the quota at $k$ would be filled earlier), and at time $u_k$, if finite, every other quota is already filled, so all remaining unstopped mass is at $k$ and stops there. In particular, $T \le T_k < \infty$ a.s., and then by (9.11) $P_\mu(X_T = j) = \rho_j$ for all $j$. So $T$ is an admissible stopping time.
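The filling scheme is straightforward to implement by propagating the unstopped probability mass; in the sketch below the chain, $\mu$ and $\rho$ are arbitrary illustrative choices.

```python
# A direct implementation of the filling scheme (finite horizon cutoff).

def filling_scheme(P, mu, rho, horizon=200):
    """Return (stop, nu): stop[t][j] = P(X_t = j, T = t), and the
    occupation measure nu[j] = sum_t P(X_t = j, T > t)."""
    n = len(P)
    sigma = [0.0] * n          # quota filled so far at each state
    theta = mu[:]              # theta[j] = P(X_t = j, T >= t)
    stop, nu = [], [0.0] * n
    for t in range(horizon):
        st = [min(theta[j], rho[j] - sigma[j]) for j in range(n)]
        stop.append(st)
        cont = [theta[j] - st[j] for j in range(n)]   # mass that continues
        for j in range(n):
            sigma[j] += st[j]
            nu[j] += cont[j]
        theta = [sum(cont[i] * P[i][j] for i in range(n)) for j in range(n)]
    return stop, nu

P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]
mu = [1.0, 0.0, 0.0]
rho = [0.2, 0.3, 0.5]
stop, nu = filling_scheme(P, mu, rho)
stopped = [sum(st[j] for st in stop) for j in range(3)]   # P(X_T = j)
mean_T = sum(t * sum(st) for t, st in enumerate(stop))    # E T
```

On this example the scheme is admissible (`stopped` agrees with `rho`), the state with `nu[j] == 0` is the halting state, and $E_\mu T = \sum_j \nu_j$, anticipating the occupation-measure identity used below.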
Remark. Generically we expect (9.10) to hold for exactly one state $k$, though other possibilities may occur, e.g. in the presence of symmetry.
Now consider an arbitrary admissible stopping time $T$, and consider the associated occupation measure $\nu = (\nu_j)$:

$\nu_j = \sum_{t \ge 0} P_\mu(X_t = j,\ T > t)$.

We shall show

(9.12) $\mu_j + \sum_i \nu_i p_{ij} = \nu_j + \rho_j$ for all $j$.

Indeed, by counting the number of visits to $j$ during $0 \le t \le T$ in two ways,

$P_\mu(X_t = j,\ T \ge t) = P_\mu(X_t = j,\ T > t) + P_\mu(X_t = j,\ T = t)$.

Chapter 2 Lemma yyy showed the (intuitively obvious) fact

$P_\mu(X_t = j,\ T \ge t) = \sum_i P_\mu(X_{t-1} = i,\ T > t-1)\, p_{ij}$ for $t \ge 1$.

So summing over $t \ge 1$,

$\nu_j + \rho_j - \mu_j = \sum_i \nu_i p_{ij}$

and (9.12) follows.
Write $\nu^*$ for the occupation measure associated with the stopping time $T^*$ produced by the filling scheme. By (9.10), $\nu^*_k = 0$ for some state $k$. If $\nu$ and $\nu'$ are solutions of (9.12) then the difference $\nu - \nu'$ satisfies $\sum_i (\nu_i - \nu'_i) p_{ij} = \nu_j - \nu'_j$ and so is a multiple of the stationary distribution $\pi$. In particular, if $\nu$ is the occupation measure for some arbitrary admissible time $T$, then

$\nu = \nu^* + c \pi$ with $c = \nu_k / \pi_k \ge 0$.

Since $E_\mu T = \sum_j \nu_j = E_\mu T^* + c \ge E_\mu T^*$, with equality iff $T$ has some halting state, we have established parts (b) and (c) of the theorem, and

$\bar{t}(\mu, \rho) = E_\mu T^* = \sum_j \nu^*_j$.
To prove (a), choose a state $k$ such that $\nu^*_k = 0$, that is such that $T^* \le T_k$ a.s. Then $E_\mu T_k = E_\mu T^* + E_\rho T_k$ and hence $\bar{t}(\mu, \rho) = E_\mu T^* \le \max_l\, (E_\mu T_l - E_\rho T_l)$. But for any admissible stopping time $T$ and any state $l$,

$E_\mu T_l \le E_\mu T + E_\rho T_l$,

giving the reverse inequality $\bar{t}(\mu, \rho) \ge \max_l\, (E_\mu T_l - E_\rho T_l)$.
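The proof also suggests a practical recipe for computing $\bar{t}(\mu, \rho)$ without constructing any stopping time: solve the balance equation (9.12) for an occupation measure with a zero coordinate, and sum it. A sketch for the two-state chain (parameters illustrative; the halting state is guessed and then verified):

```python
# For the 2-state chain p(0,1) = alpha, p(1,0) = beta, with mu = delta_0
# and rho = pi, solve the balance equation (9.12)
#     mu_j + sum_i nu_i p_{ij} = nu_j + rho_j
# under the guess nu_1 = 0 (state 1 a halting state).  The j = 0
# equation then gives nu_0 directly, and t_bar = nu_0 + nu_1.

alpha, beta = 0.2, 0.3
pi = [beta / (alpha + beta), alpha / (alpha + beta)]
mu, rho = [1.0, 0.0], pi

# j = 0 equation with nu_1 = 0:  mu_0 + nu_0 * (1 - alpha) = nu_0 + rho_0
nu0 = (mu[0] - rho[0]) / alpha
nu = [nu0, 0.0]
t_bar = sum(nu)

# consistency check: the j = 1 equation must hold as well
lhs = mu[1] + nu[0] * alpha + nu[1] * (1.0 - beta)
rhs = nu[1] + rho[1]
```

The guess is validated by the second equation holding and the solution being nonnegative; `t_bar` agrees with the hitting-time formula $1/(\alpha + \beta)$ obtained earlier for this chain.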
Corollary 9.5 The minimal strong stationary time $T$ has mean $\bar{t}(\mu, \pi)$, i.e. is mean-minimal amongst all (not-necessarily-strong) stationary times, iff there exists a state $k$ such that

$P_\mu(X_t = k) = (1 - s_\mu(t))\, \pi_k$ for all $t \ge 0$.

Proof. From the construction of the minimal strong stationary time, this is the condition for $k$ to be a halting state for $T$.
Example 9.6 Patterns in coin-tossing.
Recall Chapter 2 Example yyy: $(X_t)$ is the chain on the set of $n$-tuples $(x_1, \dots, x_n)$ of coin-tosses, whose state at time $t$ records the outcomes of the last $n$ tosses. Start at some arbitrary initial state $x$. Here the deterministic stopping time $T = n$ is a strong stationary time. Now a state $y$ will be a halting state provided it does not overlap $x$, that is provided there is no $0 \le t \le n-1$ such that $(y_1, \dots, y_{n-t}) = (x_{t+1}, \dots, x_n)$. But the number of overlapping states is at most $\sum_{t=0}^{n-1} 2^t = 2^n - 1 < 2^n$, so there exists a non-overlapping state, i.e. a halting state. So $T = n$ attains the minimum $\bar{t}(x, \pi)$ ($= n$) of $E_x T$ over all stationary times (and not just over all strong stationary times).
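This example can be checked exhaustively for small $n$; in the sketch below, $n = 4$ and the initial pattern are illustrative choices. The window after $n$ fair tosses is exactly uniform, and a non-overlapping (hence halting) pattern exists.

```python
# Exhaustive check for the coin-tossing pattern chain, n = 4.

from itertools import product

n = 4
x0 = (1, 1, 0, 1)                      # arbitrary initial pattern

# Distribution of the window after n tosses: enumerate all 2^n toss
# sequences; the window at time n is just the toss sequence itself.
counts = {}
for tosses in product((0, 1), repeat=n):
    counts[tosses] = counts.get(tosses, 0) + 1

# States reachable strictly before time n are windows of the form
# (x0[t:], new tosses) for some 0 <= t <= n - 1.
reachable_early = set()
for t in range(n):
    for new in product((0, 1), repeat=t):
        reachable_early.add(x0[t:] + new)

halting = [y for y in product((0, 1), repeat=n) if y not in reachable_early]
```

Each of the $2^n$ windows appears exactly once (uniformity at time $n$), and since at most $2^n - 1$ states overlap the start, `halting` is nonempty.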
Example 9.7 Top-to-random card shuffling.
Consider the following scheme for shuffling an $n$-card deck: the top card is removed, and inserted in one of the $n$ possible positions, chosen uniformly at random. Start in some arbitrary order. Let $T$ be the first time that the card which was originally second-from-bottom has reached the top of the deck. Then it is not hard to show (Diaconis [123] p. 177) that $T$ is a strong stationary time. Now any configuration in which the originally-bottom card is the top card will be a halting state, and so $T$ is mean-minimal over all stationary times. Here $E T = n\left(\frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n-1}\right) \sim n \log n$.
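The waiting time decomposes into geometric stages (a card with $k$ cards below it rises with chance $(k+1)/n$ per shuffle), giving mean $n(\frac{1}{2} + \cdots + \frac{1}{n-1})$; a quick simulation check, with deck size and sample count as illustrative choices:

```python
# Monte Carlo check of E T for top-to-random shuffling, where T is the
# time for the originally second-from-bottom card to reach the top.

import random

def top_to_random_T(n, rng):
    deck = list(range(n))      # index 0 = top; card n-2 starts second-from-bottom
    t = 0
    while deck[0] != n - 2:
        card = deck.pop(0)                     # remove the top card ...
        deck.insert(rng.randrange(n), card)    # ... reinsert in one of n positions
        t += 1
    return t

rng = random.Random(0)
n, reps = 6, 20000
est = sum(top_to_random_T(n, rng) for _ in range(reps)) / reps
exact = n * sum(1.0 / m for m in range(2, n))
```

For $n = 6$ the exact mean is $6(\frac12 + \frac13 + \frac14 + \frac15) = 7.7$, and the estimate lands close to it.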
Example 9.8 The winning streak chain.
In a series of games which you win or lose independently with chance $p$ of winning each game, let $W_t$ be your current “winning streak”, i.e. the number of games won since your last loss. For fixed $n$, the truncated process $X_t = \min(W_t, n)$ is the Markov chain on states $\{0, 1, \dots, n\}$ with transition probabilities

$p(i, 0) = q := 1 - p, \quad 0 \le i \le n$
$p(i, i+1) = p, \quad 0 \le i \le n-1$
$p(n, n) = p$

and stationary distribution

$\pi_i = q p^i,\ 0 \le i \le n-1; \qquad \pi_n = p^n$.
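A direct stationarity check, with $p$ and $n$ as illustrative choices:

```python
# Verify that pi_i = q p^i (i < n), pi_n = p^n is stationary for the
# truncated winning-streak chain.

p, n = 0.6, 5
q = 1.0 - p

def P(i, j):
    """Transition probabilities of the truncated winning-streak chain."""
    if j == 0:
        return q                   # lose a game: streak resets to 0
    if j == min(i + 1, n):
        return p                   # win a game: streak grows (capped at n)
    return 0.0

pi = [q * p ** i for i in range(n)] + [p ** n]
pi_P = [sum(pi[i] * P(i, j) for i in range(n + 1)) for j in range(n + 1)]
```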
We present this chain, started at state $0$, as an example where it is easy to see there are different mean-minimal stationary times $T$. We’ll leave the simplest construction until last – can you guess it now? First consider $T = T_\xi$, the time to hit a random target state $\xi$ chosen (independently of the chain) with the stationary distribution $\pi$. This is a stationary time, and $n$ is a halting state, so it is mean-minimal. Now it is easy to show

$E_0 T_j = \frac{p^{-j} - 1}{q}, \quad 0 \le j \le n.$
(Slick proof: in the not-truncated chain, the stationary distribution is $\pi_j = q p^j,\ j \ge 0$, and from any state the mean time to hit $0$ is $1/q$, so that $E_j T^+_j = 1/q + E_0 T_j$.
Chapter 2 Lemma yyy says

$E_j T^+_j = 1/\pi_j$,

whence $E_0 T_j = 1/(q p^j) - 1/q$.)
So

$E_0 T_\xi = \sum_{j=0}^{n} \pi_j E_0 T_j = \sum_{j=0}^{n-1} (1 - p^j) + \frac{1 - p^n}{q} = n.$
Here is another stopping time $T$ which is easily checked to attain the stationary distribution, for the chain started at $0$. With chance $q$ stop at time $0$. Otherwise, run the chain until either hitting $n$ (in which case, stop) or returning to $0$. In the latter case, the return to $0$ occurs as a transition to $0$ from some state $M$, say. Continue until first hitting $M + 1$, then stop. Again $n$ is a halting state, so this stationary time also is mean-minimal. Of course, the simplest construction is the deterministic time $T = n$. This is a strong stationary time (the winning streak chain is a function of the patterns in coin-tossing chain), and again $n$ is clearly a halting state. Thus $\bar{t}(0, \pi) = n$ without needing the calculation above.
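The second construction is easy to simulate; the sketch below (with $p$ and $n$ as illustrative choices) checks that $X_T$ has distribution $\pi$ and that $E T = n$.

```python
# Simulate the two-phase stopping time for the winning-streak chain.

import random

def run_T(p, n, rng):
    """One sample of (T, X_T) for the chain started at 0."""
    q = 1.0 - p
    if rng.random() < q:               # stop at time 0 with chance q = pi_0
        return 0, 0
    x, t = 0, 0
    while True:                        # phase 1: until hitting n or a move into 0
        prev = x
        x = min(x + 1, n) if rng.random() < p else 0
        t += 1
        if x == n:
            return t, n
        if x == 0:
            break
    target = prev + 1                  # the transition into 0 came from M = prev
    while x != target:                 # phase 2: continue until first hitting M + 1
        x = min(x + 1, n) if rng.random() < p else 0
        t += 1
    return t, x

rng = random.Random(1)
p, n, reps = 0.6, 4, 40000
q = 1.0 - p
pi = [q * p ** i for i in range(n)] + [p ** n]
samples = [run_T(p, n, rng) for _ in range(reps)]
mean_T = sum(t for t, _ in samples) / reps
freq = [sum(1 for _, x in samples if x == j) / reps for j in range(n + 1)]
```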
Remark. One could alternatively use Corollary 9.5 to show that the strong stationary times in Examples 9.6 and 9.7 are mean-minimal stationary times. The previous examples are atypical: here is a more typical example, in which the hypothesis of Corollary 9.5 is not satisfied and so no mean-minimal stationary time is a strong stationary time.
Example 9.9 xxx needs a name!
Chapter 2 Example yyy can be rewritten as follows. Let $(U_t)$ be independent uniform on $\{0, 1, \dots, n-1\}$ and let $(A_t)$ be independent events with $P(A_t) = a$. Define a chain on $\{0, 1, \dots, n-1\}$ by

$X_t = U_t$ on $A_t$; $\quad X_t = X_{t-1} + 1 \bmod n$ on $A^c_t$.

The stationary distribution $\pi$ is the uniform distribution. Take $X_0 = 0$. Clearly $T = \min\{t \ge 1: A_t \mbox{ occurs}\}$ is a strong stationary time, and $E T = 1/a$, and it is easy to see that $T$ is the minimal strong stationary time. But $T$ is not a mean-minimal stationary time. The occupation measure $\nu$ associated with $T$ is such that $\min_j \nu_j > 0$, and so the occupation measure associated with a mean-minimal stationary time is $\nu - n (\min_j \nu_j) \pi$, and so $\bar{t}(0, \pi) = E T - n \min_j \nu_j < E T$.
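On one reading of the example above — the chain that moves one step around the $n$-cycle except that at each time, independently with probability $a$, it jumps to a uniform random state — the occupation-measure computation can be checked numerically; $n$ and $a$ here are illustrative choices.

```python
# Occupation measure of T = first jump time, chain started at 0.
# Before T the chain moves deterministically 0, 1, 2, ... around the
# cycle, and P(T > t) = (1 - a)^t, so
#     nu_j = sum over t = j, j + n, j + 2n, ... of (1 - a)^t.

n, a = 5, 0.3
r = 1.0 - a

nu = [sum(r ** t for t in range(j, 200, n)) for j in range(n)]

ET = 1.0 / a
c = n * min(nu)                  # multiple of pi = uniform that can be removed
t_bar = ET - c
```

The check confirms $\sum_j \nu_j = E T = 1/a$, that $\min_j \nu_j = (1-a)^{n-1}/(1-(1-a)^n) > 0$, and that the resulting mean-minimal value `t_bar` is strictly smaller than $E T$.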