Chapter 4 Theorem yyy involved three mixing time parameters: $\tau_{1}$, related to variation distance to stationarity; $\tau_{1}^{(1)}$, related to “separation” from stationarity; and $\tau_{1}^{(2)}$, related to stationary times (see below). In Chapter 4 these parameters were defined under worst-case initial distributions, and our focus was on “equivalence” of these parameters for reversible chains. Here we discuss the underlying “exact” results. Fix an initial distribution $\mu$. Then associated with each notion of mixing there is a corresponding construction of a minimal random time $T$, stated in Theorems 9.1–9.3 below.

xxx randomized stopping times

Call a stopping time $T$ a strong stationary time if

$P_{\mu}(X_{t}=j,T=t)=\pi_{j}P_{\mu}(T=t)\mbox{ for all }j,t$  (9.1)

i.e. if $X_{T}$ has distribution $\pi$ and is independent of $T$. Call a stopping time $T$ a stationary time if

$P_{\mu}(X_{T}=j)=\pi_{j}\mbox{ for all }j.$  (9.2)

Call a random time $T$ a coupling time if we can construct a joint distribution $((X_{t},Y_{t});t\geq 0)$ such that $(X_{t})$ is the chain with initial distribution $\mu$, $(Y_{t})$ is the stationary chain, and $X_{t}=Y_{t},\ t\geq T$. (A coupling time need not be a stopping time, even w.r.t. the joint process; this is almost the only instance of a random time which is not a stopping time that we encounter in this book.)
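As a numerical illustration of a coupling time (a sketch, not from the text: the two-state chain and all numbers are invented), one can build the classical independent coupling — run the two chains independently until they meet, then run them together — and check the coupling inequality ${\rm vd}_{\mu}(t)\leq P_{\mu}(T>t)$ exactly, by making the diagonal of the product chain absorbing.

```python
import numpy as np

# Hypothetical two-state chain (numbers invented for illustration).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = np.array([2/3, 1/3])                 # its stationary distribution
mu = np.array([1.0, 0.0])                 # initial distribution

# Independent coupling: (X, Y) move independently until they meet;
# the diagonal is made absorbing, since afterwards X_t = Y_t.
Q = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        s = 2 * i + j
        if i == j:
            Q[s, s] = 1.0
        else:
            for i2 in range(2):
                for j2 in range(2):
                    Q[s, 2 * i2 + j2] = P[i, i2] * P[j, j2]

start = np.outer(mu, pi).ravel()          # X_0 ~ mu, Y_0 ~ pi, independent
for t in range(15):
    vd = 0.5 * np.abs(mu @ np.linalg.matrix_power(P, t) - pi).sum()
    joint = start @ np.linalg.matrix_power(Q, t)
    p_T_gt_t = joint[1] + joint[2]        # P(T > t): mass still off the diagonal
    assert vd <= p_T_gt_t + 1e-12         # the coupling inequality
```

This particular coupling is generally not the minimal one promised by Theorem 9.2; it only witnesses the inequality.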

Recall from yyy the notion of separation of $\theta$ from $\pi$:

$\mbox{sep}(\theta)\equiv\min\{u:\theta_{j}\geq(1-u)\pi_{j}\ \forall j\}.$

Write $\mbox{sep}_{\mu}(t)$ for the separation at time $t$ when the initial distribution was $\mu$:

$\mbox{sep}_{\mu}(t)=\min\{u:P_{\mu}(X_{t}=j)\geq(1-u)\pi_{j}\ \forall j\}.$

Similarly write $\mbox{vd}_{\mu}(t)$ for the variation distance from stationarity at time $t$:

$\mbox{vd}_{\mu}(t)=\frac{1}{2}\sum_{j}|P_{\mu}(X_{t}=j)-\pi_{j}|.$

Theorem 9.1

Let $T$ be any strong stationary time for the $\mu$-chain. Then

${\rm sep}_{\mu}(t)\leq P_{\mu}(T>t)\mbox{ for all }t\geq 0.$  (9.3)

Moreover there exists a minimal strong stationary time $T$ for which

${\rm sep}_{\mu}(t)=P_{\mu}(T>t)\mbox{ for all }t\geq 0.$  (9.4)

Theorem 9.2

For any coupling time $T$,

${\rm vd}_{\mu}(t)\leq P_{\mu}(T>t)\mbox{ for all }t\geq 0.$

Moreover there exists a minimal coupling time $T$ for which

${\rm vd}_{\mu}(t)=P_{\mu}(T>t)\mbox{ for all }t\geq 0.$

Theorem 9.3

For any stationary time $T$,

$E_{\mu}T\geq\max_{j}(E_{\mu}T_{j}-E_{\pi}T_{j}).$  (9.5)

Moreover there exist mean-minimal stationary times $T$ for which

$E_{\mu}T=\max_{j}(E_{\mu}T_{j}-E_{\pi}T_{j}).$  (9.6)

In each case, the first assertion is immediate from the definitions, and the issue is to carry out a construction of the required $T$. Despite the similar appearance of the results, attempts to place them all in a common framework have not been fruitful. We will prove Theorems 9.1 and 9.3 below, and illustrate with examples. These two proofs involve only rather simple “greedy” constructions. We won’t give the proof of Theorem 9.2 (the construction is usually called the maximal coupling: see Lindvall [234]) because the construction is a little more elaborate and the existence of the minimal coupling time is seldom useful, but on the other hand the coupling inequality in Theorem 9.2 will be used extensively in Chapter 14. In the context of Theorems 9.1 and 9.2 the minimal times $T$ are clearly unique in distribution, but in Theorem 9.3 there will generically be many mean-minimal stationary times $T$ with different distributions.

For any stopping time $T$, define

$\theta_{j}(t)=P_{\mu}(X_{t}=j,T\geq t),\ \ \ \ \sigma_{j}(t)=P_{\mu}(X_{t}=j,T=t).$  (9.7)

Clearly these vectors satisfy

$0\leq\sigma(t)\leq\theta(t),\ \ (\theta(t)-\sigma(t)){\bf P}=\theta(t+1)\ \forall t;\ \ \theta(0)=\mu.$  (9.8)

Conversely, given $(\theta(t),\sigma(t);t\geq 0)$ satisfying (9.8), we can construct a randomized stopping time $T$ satisfying (9.7) by declaring that $P(T=t|X_{t}=j,T\geq t,X_{s},s<t)=\sigma_{j}(t)/\theta_{j}(t)$. The proofs of Theorems 9.1 and 9.3 use different definitions of vectors satisfying (9.8).

Proof of Theorem 9.1. A particular sequence $(\theta(t),\sigma(t);t\geq 0)$ can be specified inductively by (9.8) and

$\sigma(t)=r_{t}\pi,\ \ \mbox{ where }r_{t}=\min_{j}\theta_{j}(t)/\pi_{j}.$  (9.9)

The associated stopping time satisfies

$P_{\mu}(X_{t}=j,T=t)=\sigma_{j}(t)=r_{t}\pi_{j}$

and so is a strong stationary time with $P_{\mu}(T=t)=r_{t}$. One can now verify inductively that

$P_{\mu}(X_{t}\in\cdot)=\theta(t)+P_{\mu}(T\leq t-1)\,\pi$

and so the separation is

${\rm sep}_{\mu}(t)=1-\min_{j}\frac{P_{\mu}(X_{t}=j)}{\pi_{j}}=P_{\mu}(T\geq t)-r_{t}=P_{\mu}(T>t).$
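The greedy recursion (9.8)–(9.9) is easy to run numerically. The sketch below (the three-state chain is an invented example, not from the text) iterates $\theta(t)$, reads off $r_{t}=P_{\mu}(T=t)$, and checks the identity ${\rm sep}_{\mu}(t)=P_{\mu}(T>t)$ at every step.

```python
import numpy as np

# Invented three-state chain for illustration.
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
pi = np.array([0.25, 0.5, 0.25])
assert np.allclose(pi @ P, pi)

mu = np.array([1.0, 0.0, 0.0])
theta = mu.copy()                       # theta(0) = mu, as in (9.8)
cum_r = 0.0                             # P(T <= t) = sum_{s <= t} r_s
for t in range(30):
    r = (theta / pi).min()              # (9.9): sigma(t) = r_t * pi
    cum_r += r
    dist = mu @ np.linalg.matrix_power(P, t)
    sep = 1.0 - (dist / pi).min()
    assert abs(sep - (1.0 - cum_r)) < 1e-10   # sep_mu(t) = P(T > t)
    theta = (theta - r * pi) @ P        # (9.8): next theta
assert cum_r > 0.999                    # the minimal T is a.s. finite here
```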

For comparison with the other two results, we stated Theorem 9.3 in terms of stopping times at which the stationary distribution is attained, but the underlying result (amplified as Theorem 9.4) holds for an arbitrary target distribution $\rho$. So fix $\rho$ as well as the initial distribution $\mu$. Call a stopping time $T$ admissible if $P_{\mu}(X_{T}\in\cdot)=\rho$. Write $\bar{t}(\mu,\rho)$ for the inf of $E_{\mu}T$ over all admissible stopping times $T$.

Theorem 9.4

(a) $\bar{t}(\mu,\rho)=\max_{j}(E_{\mu}T_{j}-E_{\rho}T_{j})$.

(b) The “filling scheme” below defines an admissible stopping time such that $E_{\mu}T=\bar{t}(\mu,\rho)$.

(c) Any admissible stopping time $T$ with the property

$\exists\ k\mbox{ such that }P_{\mu}(T\leq T_{k})=1$  (9.10)

satisfies $E_{\mu}T=\bar{t}(\mu,\rho)$.

Part (c) is rather remarkable, and can be rephrased as follows. Call a state $k$ with property (9.10) a halting state for the stopping time $T$. In words, the chain must stop if and when it hits a halting state. Then part (c) asserts that, to verify that an admissible time $T$ attains the minimum $\bar{t}(\mu,\rho)$, it suffices to show that there exists some halting state. In the next section we shall see this is very useful in simple examples.

Proof. The greedy construction used here is called a filling scheme. Recall from (9.7) the definitions

$\theta_{j}(t)=P_{\mu}(X_{t}=j,T\geq t),\ \ \ \ \sigma_{j}(t)=P_{\mu}(X_{t}=j,T=t).$

Write also $\Sigma_{j}(t)=P_{\mu}(X_{T}=j,T\leq t)$. We now define $(\theta(t),\sigma(t);t\geq 0)$ and the associated stopping time $\bar{T}$ inductively via (9.8) and

$\sigma_{j}(t)=\begin{cases}0&\mbox{if }\Sigma_{j}(t-1)=\rho_{j}\\ \theta_{j}(t)&\mbox{if }\Sigma_{j}(t-1)+\theta_{j}(t)\leq\rho_{j}\\ \rho_{j}-\Sigma_{j}(t-1)&\mbox{otherwise.}\end{cases}$

In words, we stop at the current state ($j$, say) provided our “quota” $\rho_{j}$ for the chance of stopping at $j$ has not yet been filled. Clearly

$\Sigma_{j}(t)\leq\rho_{j}\ \forall j\ \forall t.$  (9.11)

We now claim that $\bar{T}$ satisfies property (9.10). To see this, consider

$t_{j}\equiv\min\{t:\Sigma_{j}(t)=\rho_{j}\}\leq\infty.$

Then (9.10) holds by construction for any $k$ such that $t_{k}=\max_{j}t_{j}\leq\infty$. In particular, $\bar{T}\leq T_{k}<\infty$ a.s. and then by (9.11) $P_{\mu}(X_{\bar{T}}\in\cdot)=\lim_{t\rightarrow\infty}\Sigma(t)=\rho$. So $\bar{T}$ is an admissible stopping time.
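The filling scheme is equally simple to run numerically. In the sketch below (an invented three-state chain, not from the text), the three cases defining $\sigma_{j}(t)$ collapse to one minimum, $\sigma_{j}(t)=\min(\theta_{j}(t),\rho_{j}-\Sigma_{j}(t-1))$; the code accumulates $E_{\mu}\bar{T}$ and checks it against $\max_{j}(E_{\mu}T_{j}-E_{\rho}T_{j})$, with the hitting times obtained by a linear solve.

```python
import numpy as np

# Invented three-state chain for illustration.
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
pi = np.array([0.25, 0.5, 0.25])
mu = np.array([1.0, 0.0, 0.0])
rho = pi                                  # target distribution

# Filling scheme: stop at j while the quota rho_j is unfilled.
theta, Sigma, mean_T = mu.copy(), np.zeros(3), 0.0
for t in range(400):
    sigma = np.minimum(theta, rho - Sigma)
    Sigma += sigma
    mean_T += t * sigma.sum()
    theta = (theta - sigma) @ P

def mean_hit(start, j):
    # E_start T_j via the linear system (I - P restricted off j) h = 1.
    idx = [i for i in range(3) if i != j]
    h = np.linalg.solve(np.eye(2) - P[np.ix_(idx, idx)], np.ones(2))
    full = np.zeros(3); full[idx] = h
    return start @ full

t_bar = max(mean_hit(mu, j) - mean_hit(rho, j) for j in range(3))
assert abs(mean_T - t_bar) < 1e-8         # matches part (a)
```

The truncation at 400 steps is harmless here because the unstopped mass $\theta(t)$ decays geometrically.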

Remark. Generically we expect $t_{j}=\infty$ for exactly one state $j$, though other possibilities may occur, e.g. in the presence of symmetry.

Now consider an arbitrary admissible stopping time $T$, and consider the associated occupation measure ${\bf x}=(x_{j})$:

$x_{j}=E_{\mu}\sum_{t=0}^{T-1}1_{(X_{t}=j)},\mbox{ the mean number of visits to }j\mbox{ before time }T.$

We shall show

$x_{j}+\rho_{j}=\mu_{j}+\sum_{i}x_{i}p_{ij}\ \forall j.$  (9.12)

Indeed, by counting the number of visits to $j$ during times $0,1,\ldots,T-1,T$ in two ways,

$x_{j}+\rho_{j}=\mu_{j}+E_{\mu}\sum_{t=1}^{T}1_{(X_{t}=j)}.$

Chapter 2 Lemma yyy showed the (intuitively obvious) fact

$E_{\mu}(\mbox{number of transitions }i\rightarrow j\mbox{ during times }1,\ldots,T)=x_{i}p_{ij}.$

So summing over $i$,

$E_{\mu}\sum_{t=1}^{T}1_{(X_{t}=j)}=\sum_{i}x_{i}p_{ij}$

and (9.12) follows.

Write $\bar{{\bf x}}$ for the occupation measure associated with the stopping time $\bar{T}$ produced by the filling scheme. By (9.10), $\min_{k}\bar{x}_{k}=0$. If ${\bf x}$ and ${\bf x}^{\prime}$ are solutions of (9.12) then the difference ${\bf d}={\bf x}-{\bf x}^{\prime}$ satisfies ${\bf d}={\bf d}{\bf P}$ and so is a multiple of the stationary distribution $\pi$. In particular, if ${\bf x}$ is the occupation measure for some arbitrary admissible time $T$, then

${\bf x}\geq\bar{{\bf x}},\mbox{ with equality iff }\min_{k}x_{k}=0.$

Since $E_{\mu}T=\sum_{i}x_{i}$, we have established parts (b) and (c) of the theorem, and

$\bar{t}(\mu,\rho)=\sum_{i}\bar{x}_{i}.$

To prove (a), choose a state $k$ such that $\bar{x}_{k}=0$, that is such that $\bar{T}\leq T_{k}$. Then $E_{\mu}T_{k}=E_{\mu}\bar{T}+E_{\rho}T_{k}$ and hence $\bar{t}(\mu,\rho)\leq\max_{j}(E_{\mu}T_{j}-E_{\rho}T_{j})$. But for any admissible stopping time $T$ and any state $j$

$E_{\mu}T_{j}\leq E_{\mu}T+E_{\rho}T_{j}$

(because one may run the chain until time $T$, when $X_{T}$ has distribution $\rho$, and then continue until it hits $j$), giving the reverse inequality $\bar{t}(\mu,\rho)\geq\max_{j}(E_{\mu}T_{j}-E_{\rho}T_{j})$. $\Box$

Corollary 9.5

The minimal strong stationary time has mean $\bar{t}(\mu,\pi)$, i.e. is mean-minimal amongst all not-necessarily-strong stationary times, iff there exists a state $k$ such that

$P_{\mu}(X_{t}=k)/\pi_{k}=\min_{j}P_{\mu}(X_{t}=j)/\pi_{j}\ \forall t.$

Proof. From the construction of the minimal strong stationary time, this is the condition for $k$ to be a halting state.
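The hypothesis is easy to test numerically. The sketch below uses the winning-streak chain treated later in this section (values of $c$ and $n$ are arbitrary): started at $0$, the ratio $P_{0}(X_{t}=j)/\pi_{j}$ is minimized at $j=n-1$ for every $t$, so the hypothesis holds with $k=n-1$.

```python
import numpy as np

# Winning-streak chain on {0,...,n-1} (see the example later in this
# section); c and n are arbitrary illustrative values.
c, n = 0.3, 5
P = np.zeros((n, n))
for i in range(n):
    P[i, 0] += 1 - c
    P[i, min(i + 1, n - 1)] += c
pi = np.array([(1 - c) * c**i for i in range(n - 1)] + [c**(n - 1)])
assert np.allclose(pi @ P, pi)

dist = np.zeros(n); dist[0] = 1.0         # chain started at 0
for t in range(40):
    ratios = dist / pi
    # k = n-1 attains the minimum ratio at every time t
    assert ratios[n - 1] <= ratios.min() + 1e-12
    dist = dist @ P
```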

Example 9.6 Patterns in coin-tossing.

Recall Chapter 2 Example yyy: $(X_{t})$ is the chain on the set $\{H,T\}^{n}$ of $n$-tuples $i=(i_{1},\ldots,i_{n})$, recording the last $n$ tosses of a fair coin. Start at some arbitrary initial state $j=(j_{1},\ldots,j_{n})$. Here the deterministic stopping time “$T=n$” is a strong stationary time. Now a state $k=(k_{1},\ldots,k_{n})$ will be a halting state provided it does not overlap $j$, that is provided there is no $1\leq u\leq n$ such that $(j_{u},\ldots,j_{n})=(k_{1},\ldots,k_{n-u+1})$. But the number of overlapping states is at most $1+2+2^{2}+\ldots+2^{n-1}=2^{n}-1$, so there exists a non-overlapping state, i.e. a halting state. So $T$ attains the minimum $\bar{t}(j,\pi)=n$ over all stationary times (and not just over all strong stationary times).
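The overlap count can be checked by brute force; below is a small sketch (the choice of start state and of $n=4$ is an arbitrary invented illustration).

```python
from itertools import product

def overlaps(j, k):
    # k overlaps j if some (non-empty) suffix of j equals a prefix of k
    n = len(j)
    return any(j[u:] == k[:n - u] for u in range(n))

n = 4
j = ('H',) * n                          # arbitrary initial n-tuple
halting = [k for k in product('HT', repeat=n) if not overlaps(j, k)]
assert len(halting) >= 1                # a non-overlapping (halting) state exists
assert 2**n - len(halting) <= 2**n - 1  # at most 2^n - 1 overlapping states
```

For this particular start state $HHHH$, the non-overlapping states are exactly those beginning with $T$.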

Example 9.7 Top-to-random card shuffling.

Consider the following scheme for shuffling an $n$-card deck: the top card is removed, and inserted in one of the $n$ possible positions, chosen uniformly at random. Start in some arbitrary order. Let $T$ be the first time that the card which was originally second-from-bottom has reached the top of the deck. Then it is not hard to show (Diaconis [123] p. 177) that $T+1$ is a strong stationary time. Now any configuration in which the originally-bottom card is the top card will be a halting state, and so $T+1$ is mean-minimal over all stationary times. Here $E(T+1)=1+\sum_{m=2}^{n-1}\frac{n}{m}=1+n(h_{n-1}-1)$, where $h_{k}=\sum_{m=1}^{k}1/m$.
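The mean is a sum of geometric waits, and the harmonic-number form can be confirmed with exact rational arithmetic; the sketch below (a check, not from the text) evaluates both expressions.

```python
from fractions import Fraction

def mean_T_plus_1(n):
    # E(T+1) = 1 + sum_{m=2}^{n-1} n/m: while the tracked card has m
    # insertion slots below it, the wait to rise is geometric, mean n/m.
    return 1 + sum(Fraction(n, m) for m in range(2, n))

def harmonic(k):
    return sum(Fraction(1, m) for m in range(1, k + 1))

for n in range(2, 30):
    assert mean_T_plus_1(n) == 1 + n * (harmonic(n - 1) - 1)
```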

Example 9.8 The winning streak chain.

In a series of games which you win or lose independently with chance $0<c<1$ of winning each game, let $\hat{X}_{t}$ be your current “winning streak”, i.e. the number of games won since your last loss. For fixed $n$, the truncated process $X_{t}=\min(\hat{X}_{t},n-1)$ is the Markov chain on states $\{0,1,2,\ldots,n-1\}$ with transition probabilities

$p(i,0)=1-c,\ \ p(i,\min(i+1,n-1))=c;\ 0\leq i\leq n-1$

and stationary distribution

$\pi_{i}=(1-c)c^{i},\ 0\leq i\leq n-2;\ \ \ \pi_{n-1}=c^{n-1}.$

We present this chain, started at $0$, as an example where it is easy to see there are different mean-minimal stationary times $T$. We’ll leave the simplest construction until last – can you guess it now? First consider $T_{J}$, where $J$ has the stationary distribution. This is a stationary time, and $n-1$ is a halting state, so it is mean-minimal. Now it is easy to show

$E_{0}T_{j}=\frac{1}{(1-c)c^{j}}-\frac{1}{1-c},\ \ 1\leq j\leq n-1.$

(Slick proof: in the not-truncated chain, Chapter 2 Lemma yyy says

$E_{j}T^{+}_{j}=1/\pi_{j}=\frac{1}{(1-c)c^{j}}$

and, since the chain must pass through $0$ between visits to $j$, also $E_{j}T^{+}_{j}=E_{j}T_{0}+E_{0}T_{j}=\frac{1}{1-c}+E_{0}T_{j}$.)
So

$\bar{t}(0,\pi)=E_{0}T_{J}=\sum_{j\geq 1}\pi_{j}E_{0}T_{j}=n-2+\frac{\pi_{n-1}}{(1-c)c^{n-1}}-\frac{1-\pi_{0}}{1-c}=n-1.$
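Both the hitting-time formula and the value $\bar{t}(0,\pi)=n-1$ can be confirmed numerically (the particular $c$ and $n$ below are arbitrary illustrative values); the hitting times come from a standard linear solve, which is not part of the text's argument.

```python
import numpy as np

c, n = 0.3, 6                             # arbitrary illustrative values
P = np.zeros((n, n))
for i in range(n):
    P[i, 0] += 1 - c
    P[i, min(i + 1, n - 1)] += c
pi = np.array([(1 - c) * c**i for i in range(n - 1)] + [c**(n - 1)])

def E0_hit(j):
    # E_0 T_j via the linear system (I - P restricted off j) h = 1.
    idx = [i for i in range(n) if i != j]
    h = np.linalg.solve(np.eye(n - 1) - P[np.ix_(idx, idx)], np.ones(n - 1))
    full = np.zeros(n); full[idx] = h
    return full[0]

for j in range(1, n):
    assert abs(E0_hit(j) - (1/((1 - c) * c**j) - 1/(1 - c))) < 1e-7

t_bar = sum(pi[j] * E0_hit(j) for j in range(1, n))
assert abs(t_bar - (n - 1)) < 1e-7        # t_bar(0, pi) = n - 1
```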

Here is another stopping time $T$ which is easily checked to attain the stationary distribution, for the chain started at $0$. With chance $1-c$ stop at time $0$. Otherwise, run the chain until either hitting $n-1$ (in which case, stop) or returning to $0$. In the latter case, the return to $0$ occurs as a transition to $0$ from some state $M\geq 0$. Continue until first hitting $M+1$, then stop. Again $n-1$ is a halting state, so this stationary time also is mean-minimal. Of course, the simplest construction is the deterministic time $T=n-1$. This is a strong stationary time (the winning streak chain is a function of the patterns in coin tossing chain), and again $n-1$ is clearly a halting state. Thus $\bar{t}(0,\pi)=n-1$ without needing the calculation above.

Remark. One could alternatively use Corollary 9.5 to show that the strong stationary times in Examples 9.6 and 9.7 are mean-minimal stationary times. The previous examples are atypical: here is a more typical example, in which the hypothesis of Corollary 9.5 is not satisfied and so no mean-minimal stationary time is a strong stationary time.

xxx needs a name!

Chapter 2 Example yyy can be rewritten as follows. Let $(U_{t})$ be independent uniform on $\{0,1,\ldots,n-1\}$ and let $(A_{t})$ be independent events with $P(A_{t})=a$. Define a chain $X$ on $\{0,1,\ldots,n-1\}$ by

$X_{t+1}=U_{t+1}\mbox{ on }A^{c}_{t},\qquad X_{t+1}=X_{t}+1\mbox{ mod }n\mbox{ on }A_{t}.$

The stationary distribution is the uniform distribution. Take $X_{0}=0$. Clearly $T\equiv\min\{t\geq 1:A^{c}_{t-1}\mbox{ occurs}\}$ is a strong stationary time, and $ET=1/(1-a)$, and it is easy to see that $T$ is the minimal strong stationary time. But $T$ is not a mean-minimal stationary time. The occupation measure ${\bf x}$ associated with $T$ is such that $\min_{j}x_{j}=x_{n-1}=a^{n-1}+a^{2n-1}+\ldots=a^{n-1}/(1-a^{n})$, and so the occupation measure $\bar{{\bf x}}$ associated with a mean-minimal stationary time is $\bar{{\bf x}}={\bf x}-\frac{na^{n-1}}{1-a^{n}}\pi$ (the multiple of $\pi$ scaled so that $\bar{x}_{n-1}=0$), and so $\bar{t}(0,\pi)=\frac{1}{1-a}-\frac{na^{n-1}}{1-a^{n}}$.
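The occupation-measure computation can be checked numerically; in the sketch below ($a$ and $n$ are arbitrary invented values) the visits to $j$ before the first uniform step occur at times $j,j+n,j+2n,\ldots$, giving $x_{j}=a^{j}/(1-a^{n})$, and the subtracted multiple of $\pi$ is scaled so that coordinate $n-1$ of $\bar{{\bf x}}$ vanishes.

```python
import numpy as np

a, n = 0.6, 5                              # arbitrary illustrative values
# x_j = a^j + a^{j+n} + ... = a^j / (1 - a^n)
x = np.array([a**j / (1 - a**n) for j in range(n)])
assert abs(x.sum() - 1/(1 - a)) < 1e-12    # E_0 T = 1/(1-a)

pi = np.full(n, 1/n)                       # uniform stationary distribution
cmax = x.min() / pi[0]                     # scale so that xbar[n-1] = 0
xbar = x - cmax * pi
assert abs(xbar[n - 1]) < 1e-15
assert xbar.min() >= -1e-15                # xbar is a valid occupation measure
t_bar = xbar.sum()                         # = 1/(1-a) - n a^{n-1}/(1-a^n)
assert abs(t_bar - (1/(1 - a) - n * a**(n - 1)/(1 - a**n))) < 1e-12
```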

9 A Second Look at General Markov Chains (April 21, 1995)
