2 General Markov Chains (September 10, 1999)

2.6 Matthews’ method for cover times

Theorem 2.26 below is the only non-classical result in this Chapter. We make extensive use of Matthews’ method in Chapter 6 to analyze cover times for random walks on graphs.

Consider the cover time $C := \max_j T_j$ of the chain, i.e. the time required to visit every state. How can we bound $E_i C$ in terms of the mean hitting times $E_i T_j$? To appreciate the cleverness of Theorem 2.26, let us first consider a more routine argument. Write $t^* := \max_{i,j} E_i T_j$. Since $E_i C$ is unaffected by continuization, we may work in continuous time. By (2.20)

$$P_i(T_j > ket^*) \le e^{-k}, \qquad k = 1, 2, 3, \ldots.$$

By Boole’s inequality, for an $n$-state chain

$$P_i(C > ket^*) \le ne^{-k}, \qquad k = 1, 2, 3, \ldots.$$

One can rewrite this successively as

$$P_i\left(\frac{C}{et^*} > x\right) \le ne \cdot e^{-x}, \qquad 0 \le x < \infty$$
$$P_i\left(\frac{C}{et^*} - \log(en) > x\right) \le e^{-x}, \qquad 0 \le x < \infty.$$

In words, this says that the distribution of $\frac{C}{et^*} - \log(en)$ is stochastically smaller than the exponential$(1)$ distribution, implying $E_i\left(\frac{C}{et^*} - \log(en)\right) \le 1$ and hence

$$\max_i E_i C \le (2 + \log n)\, et^*.$$

This argument does lead to a bound, but one suspects the factors $2$ and $e$ are artifacts of the proof; also, it seems hard to obtain a lower bound this way. The following result both “cleans up” the upper bound and gives a lower bound.
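To see how loose the routine bound is, here is a small Python sketch. The choice of chain is mine, not from the text: simple random walk on the $n$-cycle, for which the classical gambler’s-ruin computation gives $E_i T_j = d(n-d)$ where $d$ is the cycle distance between $i$ and $j$.

```python
import math

def crude_cover_bound(n):
    """Union-bound estimate (2 + log n) * e * t* for SRW on the n-cycle."""
    # E_i T_j = d(n - d) for cycle distance d, so
    # t* = max_d d(n - d) = floor(n/2) * ceil(n/2).
    t_star = (n // 2) * ((n + 1) // 2)
    return (2 + math.log(n)) * math.e * t_star

# The exact mean cover time of the n-cycle is n(n-1)/2, i.e. 45 for n = 10,
# while the bound evaluates to about 292 -- correct but far from sharp.
print(crude_cover_bound(10))
```

The gap of roughly a factor $6$ here is exactly the sort of slack the factors $2$ and $e$ introduce.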

Theorem 2.26 (Matthews [257])

For any $n$-state Markov chain,

$$\max_v E_v C \le h_{n-1} \max_{i,j} E_i T_j$$
$$\min_v E_v C \ge h_{n-1} \min_{i \ne j} E_i T_j$$

where $h_{n-1} := \sum_{m=1}^{n-1} \frac{1}{m}$.
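The theorem is easy to evaluate in examples. The following sketch (again using my illustrative $n$-cycle, with $E_i T_j = d(n-d)$ for cycle distance $d$; not an example from the text) computes both Matthews bounds and checks them against the exact mean cover time $n(n-1)/2$ of the cycle.

```python
def matthews_bounds(n):
    """Matthews' lower and upper bounds on mean cover time, SRW on the n-cycle."""
    h = sum(1.0 / m for m in range(1, n))      # harmonic number h_{n-1}
    hit = [d * (n - d) for d in range(1, n)]   # E_i T_j as a function of distance d
    return h * min(hit), h * max(hit)          # (lower bound, upper bound)

lo, hi = matthews_bounds(10)
# For n = 10: h_9 * 9 and h_9 * 25; the exact value 10*9/2 = 45 lies between them.
```

Note the bounds differ by the ratio $\max E_i T_j / \min E_i T_j$, so they are tightest for chains whose hitting times are nearly uniform.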

Proof. We’ll prove the lower bound; the proof of the upper bound is identical. Let $J_1, J_2, \ldots, J_n$ be a uniform random ordering of the states, independent of the chain. Define $C_m := \max_{i \le m} T_{J_i}$ to be the time until all of $\{J_1, J_2, \ldots, J_m\}$ have been visited, in some order. The key identity is

$$E(C_m - C_{m-1} \mid J_1, \ldots, J_m;\, X_t, t \le C_{m-1}) = t(L_{m-1}, J_m)\, 1_{(L_m = J_m)} \qquad (2.26)$$

where $t(i,j) := E_i T_j$ and

$$L_m \mbox{ is the state amongst } \{J_1, J_2, \ldots, J_m\} \mbox{ hit last.}$$

To understand what this says, suppose we are told which are the states $\{J_1, J_2, \ldots, J_m\}$ and told the path of the chain up through time $C_{m-1}$. Then we know whether or not $L_m = J_m$: if not, then $C_m = C_{m-1}$, and if so, then the conditional distribution of $C_m - C_{m-1}$ is the distribution of the time to hit $J_m$ from the state at time $C_{m-1}$, which we are told is state $L_{m-1}$.

Writing $t_* := \min_{i \ne j} t(i,j)$, the right side of (2.26) is $\ge t_* 1_{(L_m = J_m)}$, and so taking expectations

$$E(C_m - C_{m-1}) \ge t_* P(L_m = J_m).$$

But obviously $P(L_m = J_m) = 1/m$ by symmetry. So

$$E_v C = E_v C_1 + \sum_{m=2}^{n} E_v(C_m - C_{m-1}) \ge E_v C_1 + t_* \sum_{m=2}^{n} \frac{1}{m}.$$

Allowing for the possibility $J_1 = v$ we see $E_v C_1 \ge (1 - \frac{1}{n}) t_*$; since $(1 - \frac{1}{n}) + \sum_{m=2}^{n} \frac{1}{m} = h_n - \frac{1}{n} = h_{n-1}$, the lower bound follows.
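The theorem can also be checked by simulation. The sketch below (my own illustration, again on the $n$-cycle) estimates the mean cover time by Monte Carlo and confirms it sits between Matthews’ two bounds computed from the closed-form hitting times $E_i T_j = d(n-d)$.

```python
import random

def cover_time_cycle(n, rng):
    """One sampled cover time of simple random walk on the n-cycle, started at 0."""
    pos, visited, t = 0, {0}, 0
    while len(visited) < n:
        pos = (pos + rng.choice((-1, 1))) % n  # step to a uniform random neighbour
        visited.add(pos)
        t += 1
    return t

n, trials = 10, 2000
rng = random.Random(0)
mean_c = sum(cover_time_cycle(n, rng) for _ in range(trials)) / trials

h = sum(1.0 / m for m in range(1, n))        # h_{n-1}
hit = [d * (n - d) for d in range(1, n)]     # E_i T_j by cycle distance
# mean_c should be near the exact value n(n-1)/2 = 45,
# inside the interval [h * min(hit), h * max(hit)].
```

By symmetry of the cycle the starting state is irrelevant, so starting at $0$ loses no generality.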