2 General Markov Chains (September 10, 1999)

2.9 Notes on Chapter 2.

Textbooks on Markov chains.

It is easy to write books on …or finite Markov chains, or on any of the other well-understood topics for which no further expositions are needed. G.-C. Rota

Your search for the Subject: MARKOV PROCESSES

retrieved 273 records
. U.C. Berkeley Library book catalog, September 1999.

Almost every introductory textbook on stochastic processes has a chapter or two about Markov chains: among the best are Karlin-Taylor [208, 209], Grimmett-Stirzaker [176] and, slightly more advanced, Asmussen [35]. In addition to Norris [270] there are several other undergraduate-level textbooks entirely or mostly devoted to Markov chains: Adke-Manjanuth [1], Hunter [186], Iosifescu [189], Isaacson-Madsen [191], Kemeny-Snell [215], Romanovsky [297]. At the graduate level, Durrett [133] has a concise chapter on the modern approach to the basic limit theory. Several more advanced texts which overlap our material were mentioned in Chapter 1 section 2.3 (yyy 7/20/99 version); other texts are Freedman [154], Anderson [31], and the treatize of Syski [318] on hitting times. Most textbooks leave an exaggerated impression of the difference between discrete- and continuous-time chains.

Section 2.1.2.Continuized is an ugly neologism, but no-one has collected my $5 prize for suggesting a better name!

Section 2.2. Elementary matrix treatments of results like those in section 2.2.2 for finite state space can be found in [186, 215]. On more general spaces, this is part of recurrent potential theory: see [96, 214] for the countable-state setting and Revuz [289] for continuous space. Our treatment, somewhat novel at the textbook level, Pitman [283] studied occupation measure identities more general than thos in section 2.2.1 and their applications to hitting time formulas, and we follow his approach in sectionMHTF. We are being slightly dishonest in treating Lemmas 2.5 and 2.6 this way, because these facts figure in the “right” proof of the ergodic theorems we use. We made a special effort not to abbreviate “number of visits to jj before time SS” as Nj(S)N_{j}(S), which forces the reader to decode formulas.

Kemeny and Snell [215] call 𝐙+Π{\bf Z}+\Pi the fundamental matrix, and use (EiTj+)(E_{i}T^{+}_{j}) rather than (EiTj)(E_{i}T_{j}) as the matrix of mean hitting times. Our set-up seems a little smoother – cf. Meyer [202] who calls 𝐙{\bf Z} the group inverse of 𝐈-𝐏{\rm{\bf I}}-{\bf P}.

The name “random target lemma” for Corollary 2.14 was coined by Lovász and Winkler [240]; the result itself is classical ([215] Theorem 4.4.10).

Open Problem 2.34

Portmanteau theorem for occupation times.

Can the results of section 2.2.2 be formulated as a single theorem? To explain the goal by analogy, consider the use [194] of Feynman diagrams to calculate quantities such as E(A3BC2)E(A^{3}BC^{2}) for dependent mean-zero Gaussian (A,B,C)(A,B,C). One rewrites the expectation as Ei=16ξiE\prod_{i=1}^{6}\xi_{i} for ξ1=ξ2=ξ3=A;ξ4=B,ξ5=ξ6=C\xi_{1}=\xi_{2}=\xi_{3}=A;\xi_{4}=B,\xi_{5}=\xi_{6}=C, and then applies the formula

Ei=16ξi=Mν(M)E\prod_{i=1}^{6}\xi_{i}=\sum_{M}\nu(M)

where the sum is over matchings M={{u1,v1},{u2,v2},{u3,v3}}M=\{\{u_{1},v_{1}\},\{u_{2},v_{2}\},\{u_{3},v_{3}\}\} of {1,2,3,4,5,6}\{1,2,3,4,5,6\} and where

ν(M)=j=13E(ξujξvj).\nu(M)=\prod_{j=1}^{3}E(\xi_{u_{j}}\xi_{v_{j}}).

By analogy, we seek a general rule which associates an expression like

Ei(number of visits to jj before time min(Tk,Tl)\min(T_{k},T_{l}))E_{i}(\mbox{number of visits to $j$ before time $\min(T_{k},T_{l})$})

with a combinatorial structure involving {i,j,k,l}\{i,j,k,l\}; then associates with the combinatorial structure some function of variables {pv,zvw,  v,w{i,j,k,l}}\{p_{v},z_{vw},\ v,w\in\{i,j,k,l\}\}; then shows that the value of the expression applied to a finite Markov chain equals the function of {πv,Zvw,  v,w{i,j,k,l}}\{\pi_{v},Z_{vw},\ v,w\in\{i,j,k,l\}\}.

Section 2.4.1. Corollary 2.21 and variants are the basis for the theory of positive-recurrent chains on continuous spaces: see [133] section 5.6 and Meyn and Tweedie [263].

Section 2.4.2. The fact that ||θ(t)-π||2||\theta(t)-\pi||_{2} is decreasing is a special case (H(u)=u2H(u)=u^{2}) of the following result (e.g. [213] Theorem 1.6).

Lemma 2.35

Let H:[0,)[0,)H:[0,\infty)\rightarrow[0,\infty) be concave [convex]. Let θ(t)\theta(t) be the distribution of an irreducible chain with stationary distribution π\pi. Then iπiH(θi(t)/πi)\sum_{i}\pi_{i}H(\theta_{i}(t)/\pi_{i}) is increasing [decreasing].

Section 2.6.Matthews [257, 258] introduced his method (Theorem 2.26) to study some highly symmetric walks (cf. Chapter 7) and to study some continuous-space Brownian motion covering problems.

Section 2.7. A more sophisticated notion is “the chain conditioned never to hit AA”, which can be formalized using Perron-Frobenius theory.

Section 2.8.1. Applying the optional stopping theorem involves checking side conditions (involving integrability of the martingale or the stopping time), but these are trivially satisfied in our applications.

Numerical methods. In many applications of non-reversible chains, e.g. to queueing-type processes, one must resort to numerical computations of the stationary distribution: see Stewart [314]. We don’t discuss such issues because in the reversible case we have conceptually simple expressions for the stationary distribution,

Matrix methods. There is a curious dichotomy between textbooks on Markov chains which use matrix theory almost everywhere and textbooks which use matrix theory almost nowhere. Our style is close to the latter; matrix formalism obscures more than it reveals. For our purposes, the one piece of matrix theory which is really essential is the spectral decomposition of reversible transition matrices in Chapter 3. Secondarily useful is the theory surrounding the Perron-Frobenius theorem, quoted for reversible chains in Chapter 3 section 6.5. (yyy 9/2/94 version)

yyy move both subsections to Chapter 8 “A Second Look …”.