3 Reversible Markov Chains (September 10, 2002)

3.1 Introduction

Recall that ${\bf P}$ denotes the transition matrix and $\pi$ the stationary distribution of a finite irreducible discrete-time chain $(X_t)$. Call the chain reversible if

$$\pi_i p_{ij} = \pi_j p_{ji} \quad \text{for all } i,j. \qquad (3.1)$$

Equivalently, suppose (for given irreducible ${\bf P}$) that $\pi$ is a probability distribution satisfying (3.1). Then $\pi$ is the unique stationary distribution and the chain is reversible. This is true because (3.1), sometimes called the detailed balance equations, implies

$$\sum_i \pi_i p_{ij} = \pi_j \sum_i p_{ji} = \pi_j \quad \text{for all } j,$$

and therefore $\pi$ satisfies the balance equations of (1) in Chapter 2. [margin: 9/10/99 version]
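To make the two-way implication concrete, here is a minimal numeric sketch (the biased birth-and-death chain is my own example, not from the text): solve the detailed balance recursion for $\pi$, then confirm that the resulting $\pi$ satisfies the balance equations.

```python
import numpy as np

# A minimal sketch (my example): a biased birth-and-death chain on {0,...,4}.
# Such chains satisfy detailed balance, and (3.1) determines pi up to
# normalization via pi_{i+1} = pi_i p_{i,i+1} / p_{i+1,i}.
n, up, down = 5, 0.3, 0.7
P = np.zeros((n, n))
for i in range(n):
    if i < n - 1:
        P[i, i + 1] = up
    if i > 0:
        P[i, i - 1] = down
    P[i, i] = 1.0 - P[i].sum()          # holding probability at the two ends

pi = np.ones(n)
for i in range(n - 1):
    pi[i + 1] = pi[i] * P[i, i + 1] / P[i + 1, i]
pi /= pi.sum()

flows = pi[:, None] * P                  # flows[i, j] = pi_i p_ij
assert np.allclose(flows, flows.T)       # detailed balance (3.1)
assert np.allclose(pi @ P, pi)           # hence pi solves the balance equations
```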

The name reversible comes from the following fact. If $(X_t)$ is the stationary chain, that is, if $X_0$ has distribution $\pi$, then

$$(X_0, X_1, \ldots, X_t) \ \stackrel{d}{=}\ (X_t, X_{t-1}, \ldots, X_0).$$

More vividly, given a movie of the chain run forwards and the same movie run backwards, you cannot tell which is which.

It is elementary that the same symmetry property (3.1) holds for the $t$-step transition matrix ${\bf P}^t$:

$$\pi_i p^{(t)}_{ij} = \pi_j p^{(t)}_{ji},$$

and thence for the matrix ${\bf Z}$ of (6) in Chapter 2 [margin: 9/10/99 version]:

$$\pi_i Z_{ij} = \pi_j Z_{ji}. \qquad (3.2)$$
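A quick numeric check of both symmetries (a sketch; the weighted-graph chain is my construction, and I am assuming the Chapter 2 definition $Z_{ij} = \sum_t (p^{(t)}_{ij} - \pi_j)$, equivalently ${\bf Z} = (I - {\bf P} + \mathbf{1}\pi)^{-1} - \mathbf{1}\pi$):

```python
import numpy as np

# Sketch (my construction): random walk on a graph with symmetric edge
# weights W is reversible, with pi proportional to the row sums of W.
rng = np.random.default_rng(0)
n = 6
W = rng.random((n, n)); W = W + W.T
P = W / W.sum(axis=1, keepdims=True)     # p_ij = w_ij / w_i
pi = W.sum(axis=1) / W.sum()

Pt = np.linalg.matrix_power(P, 7)        # any fixed t; here t = 7
Pi1 = np.outer(np.ones(n), pi)           # the rank-one matrix with rows pi
Z = np.linalg.inv(np.eye(n) - P + Pi1) - Pi1   # assumed form of Chapter 2's Z

assert np.allclose(pi[:, None] * Pt, (pi[:, None] * Pt).T)  # symmetry for P^t
assert np.allclose(pi[:, None] * Z,  (pi[:, None] * Z).T)   # (3.2)
```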

But beware that the symmetry property does not work for mean hitting times: the assertion

$$\pi_i E_i T_j = \pi_j E_j T_i$$

is definitely false in general (see the Notes for one intuitive explanation). See Chapter 7 [margin: 1/31/94 version] for further discussion. The following general lemma [margin: The lemma has been copied here from Section 1.2 of Chapter 7 (1/31/94 version); reminder: it still needs to be deleted there!] will be useful there.
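The failure is easy to exhibit numerically (a sketch, my construction, using the mean hitting time formula $E_i T_j = (Z_{jj} - Z_{ij})/\pi_j$ and the same assumed form of ${\bf Z}$ as above):

```python
import numpy as np

# Sketch (my construction): a reversible chain for which
# pi_i E_i T_j != pi_j E_j T_i.
rng = np.random.default_rng(1)
n = 6
W = rng.random((n, n)); W = W + W.T      # random walk on a weighted graph
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()

Pi1 = np.outer(np.ones(n), pi)
Z = np.linalg.inv(np.eye(n) - P + Pi1) - Pi1
ET = (np.diag(Z)[None, :] - Z) / pi[None, :]   # ET[i, j] = E_i T_j

weighted = pi[:, None] * ET                    # weighted[i, j] = pi_i E_i T_j
print(np.allclose(weighted, weighted.T))       # False: no such symmetry
```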

Lemma 3.1

For an irreducible reversible chain, the following are equivalent.

(a) $P_i(X_t = i) = P_j(X_t = j)$ for all $i, j \in I$, $t \geq 1$;

(b) $P_i(T_j = t) = P_j(T_i = t)$ for all $i, j \in I$, $t \geq 1$.

Proof. In either case the stationary distribution is uniform: under (a) by letting $t \to \infty$, and under (b) by taking $t = 1$, implying $p_{ij} \equiv p_{ji}$. So by reversibility $P_i(X_t = j) = P_j(X_t = i)$ for $i \neq j$ and $t \geq 1$. But recall from Chapter 2 [margin: 9/10/99 version] Lemma 25 that the generating functions

$$G_{ij}(z) := \sum_t P_i(X_t = j)\, z^t, \qquad F_{ij}(z) := \sum_t P_i(T_j = t)\, z^t$$

satisfy

$$F_{ij} = G_{ij}/G_{jj}. \qquad (3.3)$$

For $i \neq j$ we have seen that $G_{ij} = G_{ji}$, and hence by (3.3)

$$F_{ij} = F_{ji} \quad \text{iff} \quad G_{jj} = G_{ii},$$

which is the assertion of Lemma 3.1. $\Box$
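Both conditions can be checked directly on a small vertex-transitive example (a sketch, my construction): random walk on the 5-cycle, with first-passage probabilities computed by the recursion $P_i(T_j = t) = \sum_{k \neq j} p_{ik} P_k(T_j = t-1)$.

```python
import numpy as np

# Sketch (my construction): random walk on the 5-cycle, a vertex-transitive
# chain, so both (a) and (b) of Lemma 3.1 should hold.
n = 5
P = np.zeros((n, n))
for i in range(n):
    P[i, (i + 1) % n] = P[i, (i - 1) % n] = 0.5

# (a): equal return probabilities P_i(X_t = i).
Pt = np.linalg.matrix_power(P, 6)
assert np.allclose(np.diag(Pt), np.diag(Pt)[0])

# (b): first-passage probabilities F[t, i, j] = P_i(T_j = t), built from
# P_i(T_j = 1) = p_ij and the recursion above.
tmax = 12
F = np.zeros((tmax + 1, n, n))
F[1] = P
for t in range(2, tmax + 1):
    for j in range(n):
        Pj = P.copy(); Pj[:, j] = 0.0        # forbid hitting j before time t
        F[t, :, j] = Pj @ F[t - 1, :, j]
for t in range(1, tmax + 1):
    assert np.allclose(F[t], F[t].T)         # P_i(T_j = t) = P_j(T_i = t)
```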

The discussion above extends to continuous time with only notational changes, e.g., the detailed balance equations (3.1) become

$$\pi_i q_{ij} = \pi_j q_{ji} \quad \text{for all } i,j. \qquad (3.4)$$

3.1.1 Time-reversals and cat-and-mouse games

For a general chain we can define the time-reversed chain to have transition matrix ${\bf P}^*$, where

$$\pi_i p_{ij} = \pi_j p^*_{ji},$$

so that the chain is reversible iff ${\bf P}^* = {\bf P}$. One can check [cf. (3.2)]

$$\pi_i Z_{ij} = \pi_j Z^*_{ji}. \qquad (3.5)$$

The stationary ${\bf P}^*$-chain is just the stationary ${\bf P}$-chain run backwards in time. Consider Examples 16 and 22 from Chapter 2. [margin: 9/10/99 version] In Example 16 (patterns in coin tossing) the time-reversal ${\bf P}^*$ just "shifts left" instead of shifting right, i.e., from HTTTT the possible transitions are to HHTTT and THTTT. In Example 22 the time-reversal just reverses the direction of motion around the $n$-cycle:

$$p^*_{ij} = a\,1_{(j = i-1)} + \frac{1-a}{n}.$$

Warning. These examples are simple because the stationary distributions are uniform. If the stationary distribution has no simple form, then typically ${\bf P}^*$ will have no simple form.
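In matrix form the definition reads ${\bf P}^* = (\mathop{\rm diag}\pi)^{-1} {\bf P}^T \mathop{\rm diag}\pi$. Here is a sketch of the Example 22 computation (the code and parameter values are my own):

```python
import numpy as np

# Sketch (my construction): the biased walk on the n-cycle of Example 22,
# p_ij = a 1{j = i+1} + (1-a)/n, whose stationary distribution is uniform.
n, a = 8, 0.6
P = np.full((n, n), (1 - a) / n)
for i in range(n):
    P[i, (i + 1) % n] += a
pi = np.full(n, 1.0 / n)

Pstar = (P * pi[:, None]).T / pi[:, None]   # p*_ij = pi_j p_ji / pi_i

# The reversal moves counterclockwise: p*_ij = a 1{j = i-1} + (1-a)/n.
expected = np.full((n, n), (1 - a) / n)
for i in range(n):
    expected[i, (i - 1) % n] += a
assert np.allclose(Pstar, expected)
assert np.allclose(pi @ Pstar, pi)          # pi is stationary for P* as well
```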

A few facts about reversible chains are really specializations of facts about general chains which involve both ${\bf P}$ and ${\bf P}^*$. Here is a simple instance.

Lemma 3.2 (The cyclic tour property)

For states $i_0, i_1, \ldots, i_m$ of a reversible chain,

$$E_{i_0} T_{i_1} + E_{i_1} T_{i_2} + \cdots + E_{i_m} T_{i_0} = E_{i_0} T_{i_m} + E_{i_m} T_{i_{m-1}} + \cdots + E_{i_1} T_{i_0}.$$

The explanation is that in a general chain we have

$$E_{i_0} T_{i_1} + E_{i_1} T_{i_2} + \cdots + E_{i_m} T_{i_0} = E^*_{i_0} T_{i_m} + E^*_{i_m} T_{i_{m-1}} + \cdots + E^*_{i_1} T_{i_0}, \qquad (3.6)$$

where $E^*$ refers to the time-reversed chain ${\bf P}^*$. Equality (3.6) is intuitively obvious when we visualize running a movie backwards. But a precise argument requires a little sophistication (see Notes). It is however straightforward to verify (3.6) using (3.5) and the mean hitting time formula $E_i T_j = (Z_{jj} - Z_{ij})/\pi_j$.
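A numeric check of the cyclic tour property (a sketch; the chain and the tour are my own choices, and ${\bf Z}$ is computed from the same assumed identity as before):

```python
import numpy as np

# Sketch (my construction): verify Lemma 3.2 on a random walk on a weighted graph.
rng = np.random.default_rng(2)
n = 7
W = rng.random((n, n)); W = W + W.T
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()

Pi1 = np.outer(np.ones(n), pi)
Z = np.linalg.inv(np.eye(n) - P + Pi1) - Pi1
ET = (np.diag(Z)[None, :] - Z) / pi[None, :]     # mean hitting time formula

tour = [0, 3, 5, 2, 6]                           # i_0, ..., i_m
m1 = len(tour)
fwd = sum(ET[tour[k], tour[(k + 1) % m1]] for k in range(m1))
bwd = sum(ET[tour[(k + 1) % m1], tour[k]] for k in range(m1))
assert np.isclose(fwd, bwd)                      # the cyclic tour property
```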

We shall encounter several results which have amusing interpretations as cat-and-mouse games. The common feature of these games is that the cat moves according to a transition matrix ${\bf P}$ and the mouse moves according to the time-reversed transition matrix ${\bf P}^*$.

Cat-and-mouse game 1. Both animals are placed at the same state, chosen according to the stationary distribution. The mouse makes one jump according to ${\bf P}^*$ and then stops. The cat starts moving according to ${\bf P}$ and continues until it finds the mouse, after $M$ steps.

The notable feature of this game is the simple formula for $EM$:

$$EM = n - 1, \quad \text{where $n$ is the number of states.} \qquad (3.7)$$

This is simple once you see the right picture. Consider the stationary ${\bf P}$-chain $(X_0, X_1, X_2, \ldots)$. We can specify the game in terms of that chain by taking the initial state to be $X_1$, the mouse's jump to be to $X_0$, and the cat's moves to be to $X_2, X_3, \ldots$. So $M = T^+ - 1$ with

$$T^+ := \min\{t \geq 1 : X_t = X_0\}.$$

And $ET^+ = \sum_i \pi_i E_i T^+_i = \sum_i \pi_i \frac{1}{\pi_i} = n$.
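A Monte Carlo sanity check of (3.7) (a sketch; the chain is my construction). Following the picture above, we sample $X_0 \sim \pi$, start the cat at $X_1$, and count the cat's steps until it returns to $X_0$:

```python
import numpy as np

# Sketch (my construction): estimate EM for cat-and-mouse game 1.
rng = np.random.default_rng(3)
n = 5
W = rng.random((n, n)); W = W + W.T
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()

def one_game():
    x0 = rng.choice(n, p=pi)        # X_0 ~ pi: the mouse's resting place
    x = rng.choice(n, p=P[x0])      # X_1: the common starting state
    steps = 0
    while x != x0:                  # cat moves X_2, X_3, ... until it hits X_0
        x = rng.choice(n, p=P[x])
        steps += 1
    return steps                    # M = T^+ - 1

est = np.mean([one_game() for _ in range(20000)])
print(est, n - 1)                   # the estimate should be close to n - 1 = 4
```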

Cat-and-mouse game 2. This game, and Proposition 3.3, are rephrasings of results of Coppersmith et al [101]. Think of the cat and the mouse as pieces in a solitaire board game. The player sees their positions and chooses which one to move: if the cat is chosen, it makes a move according to ${\bf P}$, and if the mouse is chosen, it makes a move according to ${\bf P}^*$. Let $M$ denote the number of moves until the cat and mouse meet. One expects the mean $E_{(x,y)} M$ to depend on the initial positions $(x,y)$ of (cat, mouse) and on the player's strategy. But consider the example of asymmetric random walk on an $n$-cycle, with (say) chance $2/3$ of moving clockwise and chance $1/3$ of moving counterclockwise. A moment's thought reveals that the distance from cat to mouse (measured clockwise) evolves in the same way whichever animal is chosen, so it does not depend on the player's strategy, and hence neither does $E_{(x,y)} M$. In general $EM$ does depend on the strategy, but the following result shows that the effect of the strategy is bounded in terms of a measure of non-symmetry of the chain.

Proposition 3.3

Regardless of strategy,

$$\min_z E_\pi T_z \ \leq\ E_{(x,y)} M - (E_x T_y - E_\pi T_y) \ \leq\ \max_z E_\pi T_z,$$

where the hitting times $T$ refer to the ${\bf P}$-chain.

Proof. Consider the functions [margin: Symbol used here is "defined identically to be".]

$$f(x,y) :\equiv E_x T_y - E_\pi T_y,$$
$$f^*(y,x) :\equiv E^*_y T_x - E^*_\pi T_x.$$

The first-step recurrences for $x \mapsto E_x T_y$ and $y \mapsto E^*_y T_x$ give

$$f(x,y) = 1 + \sum_z p_{xz} f(z,y), \quad y \neq x, \qquad (3.8)$$
$$f^*(y,x) = 1 + \sum_z p^*_{yz} f^*(z,x), \quad y \neq x. \qquad (3.9)$$

By the mean hitting time formula, together with (3.5) for the middle equality,

$$f(x,y) = \frac{-Z_{xy}}{\pi_y} = \frac{-Z^*_{yx}}{\pi_x} = f^*(y,x),$$

so we may rewrite (3.9) as

$$f(x,y) = 1 + \sum_z p^*_{yz} f(x,z), \quad y \neq x. \qquad (3.10)$$

Now let $(\hat{X}_t, \hat{Y}_t)$ be the positions of (cat, mouse) after $t$ moves according to some strategy. Consider

$$W_t \equiv t + f(\hat{X}_t, \hat{Y}_t).$$

Equalities (3.8) and (3.10) are exactly what is needed to verify

$$(W_t;\ 0 \leq t \leq M) \ \text{is a martingale.}$$

So the optional stopping theorem says $E_{(x,y)} W_0 = E_{(x,y)} W_M$, that is,

$$f(x,y) = E_{(x,y)} M + E_{(x,y)} f(\hat{X}_M, \hat{Y}_M). \qquad (3.11)$$

But $\hat{X}_M = \hat{Y}_M$ and $-f(z,z) = E_\pi T_z$, so

$$\min_z E_\pi T_z \ \leq\ -f(\hat{X}_M, \hat{Y}_M) \ \leq\ \max_z E_\pi T_z,$$

and the result follows from (3.11). $\Box$

Remarks. Symmetry conditions in the reversible setting are discussed in Chapter 7. [margin: 1/31/94 version] Vertex-transitivity forces $E_\pi T_z$ to be independent of $z$, and hence in the present setting implies $E_{(x,y)} M = E_x T_y$ regardless of strategy. For a reversible chain without this symmetry condition, consider $(x_0, y_0)$ attaining the min and max of $E_\pi T_z$. The Proposition then implies $E_{y_0} T_{x_0} \leq E_{(x_0,y_0)} M \leq E_{x_0} T_{y_0}$, and the bounds are attained by keeping one animal fixed. But for general initial states the bounds of the Proposition are not attained. Indeed, the proof shows that to attain the bounds we need a strategy which forces the animals to meet at states attaining the extrema of $E_\pi T_z$. Finally, in the setting of random walk on an $n$-vertex graph we can combine Proposition 3.3 with mean hitting time bounds from Chapter 6 [margin: 10/31/94 version] to show that $EM$ is at worst $O(n^3)$.
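A simulation illustrating Proposition 3.3 (a sketch; the chain, the initial positions, and the deliberately arbitrary strategy rule are my own choices): whatever rule the player uses, the centered mean $E_{(x,y)}M - (E_x T_y - E_\pi T_y)$ should land between $\min_z E_\pi T_z$ and $\max_z E_\pi T_z$, up to Monte Carlo error.

```python
import numpy as np

# Sketch (my construction): Monte Carlo check of the bounds in Proposition 3.3.
rng = np.random.default_rng(4)
n = 6
W = rng.random((n, n)); W = W + W.T
P = W / W.sum(axis=1, keepdims=True)            # the cat's kernel
pi = W.sum(axis=1) / W.sum()
Pstar = (P * pi[:, None]).T / pi[:, None]       # the mouse's kernel P*

Pi1 = np.outer(np.ones(n), pi)
Z = np.linalg.inv(np.eye(n) - P + Pi1) - Pi1
ET = (np.diag(Z)[None, :] - Z) / pi[None, :]    # E_i T_j for the P-chain
EpiT = pi @ ET                                  # EpiT[z] = E_pi T_z

def one_game(x, y):
    moves = 0
    while x != y:
        if (x + y) % 2 == 0:                    # an arbitrary strategy rule
            x = rng.choice(n, p=P[x])           # move the cat
        else:
            y = rng.choice(n, p=Pstar[y])       # move the mouse
        moves += 1
    return moves

x, y = 0, 3
EM = np.mean([one_game(x, y) for _ in range(20000)])
centered = EM - (ET[x, y] - EpiT[y])
print(EpiT.min(), centered, EpiT.max())         # centered lies between the bounds
```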

3.1.2 Entrywise ordered transition matrices

Recall from Chapter 2 [margin: 9/10/99 version] Section 3 that for a function $f: S \to R$ with $\sum_i \pi_i f_i = 0$, the asymptotic variance rate [margin: This subsection adapted from Section 8.1 of Chapter MCMC (1/8/01 version); reminder: it still needs to be deleted there!] is

$$\sigma^2({\bf P}, f) := \lim_t t^{-1}\, {\rm var}\ \sum_{s=1}^t f(X_s) = f \Gamma f \qquad (3.12)$$

where $\Gamma_{ij} = \pi_i Z_{ij} + \pi_j Z_{ji} + \pi_i \pi_j - \pi_i \delta_{ij}$. These individual-function variance rates can be compared between chains with the same stationary distribution, under a very strong "(off-diagonal) entrywise ordering" of reversible transition matrices.
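A sketch checking (3.12) empirically (my construction; ${\bf Z}$ is again computed from the assumed fundamental-matrix identity): build $\Gamma$ from ${\bf Z}$ and compare $f \Gamma f$ with the empirical variance rate of $\sum_{s=1}^t f(X_s)$ over independent stationary runs.

```python
import numpy as np

# Sketch (my construction): compare f Gamma f with a simulated variance rate.
rng = np.random.default_rng(5)
n = 4
W = rng.random((n, n)); W = W + W.T
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()

Pi1 = np.outer(np.ones(n), pi)
Z = np.linalg.inv(np.eye(n) - P + Pi1) - Pi1
f = rng.random(n); f -= pi @ f                  # center: sum_i pi_i f_i = 0

Gamma = (pi[:, None] * Z + (pi[:, None] * Z).T
         + np.outer(pi, pi) - np.diag(pi))
sigma2 = f @ Gamma @ f

t, runs = 2000, 200                             # modest sizes; rough agreement only
sums = []
for _ in range(runs):
    x = rng.choice(n, p=pi)                     # start stationary
    s = 0.0
    for _ in range(t):
        x = rng.choice(n, p=P[x])
        s += f[x]
    sums.append(s)
print(sigma2, np.var(sums) / t)                 # the two should roughly agree
```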

Lemma 3.4 (Peskun’s Lemma [280])

Let ${\bf P}^{(1)}$ and ${\bf P}^{(2)}$ be reversible with the same stationary distribution $\pi$. Suppose $p^{(1)}_{ij} \leq p^{(2)}_{ij}$ for all $j \neq i$. Then $\sigma^2({\bf P}^{(1)}, f) \geq \sigma^2({\bf P}^{(2)}, f)$ for all $f$ with $\sum_i \pi_i f_i = 0$.

Proof. Introduce a parameter $0 \leq \alpha \leq 1$ and write

$${\bf P} = {\bf P}(\alpha) := (1-\alpha) {\bf P}^{(1)} + \alpha {\bf P}^{(2)}.$$

Write $(\cdot)'$ for $\frac{d}{d\alpha}(\cdot)$. It is enough to show

$$(\sigma^2({\bf P}, f))' \leq 0.$$

By (3.12)

$$(\sigma^2({\bf P}, f))' = f \Gamma' f = 2 \sum_i \sum_j f_i \pi_i Z'_{ij} f_j.$$

[margin: Need to decide where to put statement and proof of that lemma.]

By Chapter MCMC [margin: 1/8/01 version] Lemma 4, ${\bf Z}' = {\bf Z} {\bf P}' {\bf Z}$. By setting

$$g_i := \pi_i f_i; \qquad a_{ij} := Z_{ij}/\pi_j; \qquad w_{ij} := \pi_i p_{ij},$$

we find ${\bf A}' = {\bf A} {\bf W}' {\bf A}$ and can rewrite the equality above as

$$(\sigma^2({\bf P}, f))' = 2\, g {\bf A} {\bf W}' {\bf A} g.$$

Since ${\bf A}$ is symmetric, it is enough to show that ${\bf W}'$ is negative semidefinite. [margin: agreed: use negative semidefinite rather than nonpositive definite throughout] By hypothesis ${\bf W}'$ is symmetric with zero row-sums and $w'_{ij} \geq 0$ for $j \neq i$. Ordering states arbitrarily, we may write

$${\bf W}' = \sum_{i,j:\, i<j} w'_{ij} {\bf M}^{ij},$$

where ${\bf M}^{ij}$ is the matrix whose only nonzero entries are $m(i,i) = m(j,j) = -1$ and $m(i,j) = m(j,i) = 1$. Plainly ${\bf M}^{ij}$ is negative semidefinite, since $x {\bf M}^{ij} x = -(x_i - x_j)^2 \leq 0$; hence so is ${\bf W}'$. $\Box$
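The classical application of Peskun's Lemma compares Metropolis and Barker acceptance rules. Here is a numeric sketch (my construction): with a common symmetric proposal, the Metropolis acceptance $\min(1, \pi_j/\pi_i)$ dominates the Barker acceptance $\pi_j/(\pi_i + \pi_j)$ entrywise off the diagonal, so Lemma 3.4 predicts that Metropolis has the smaller asymptotic variance rate for every centered $f$.

```python
import numpy as np

# Sketch (my construction): Peskun comparison of Metropolis vs Barker kernels.
rng = np.random.default_rng(6)
n = 5
pi = rng.random(n); pi /= pi.sum()              # an arbitrary target distribution
Q = np.full((n, n), 1.0 / n)                    # symmetric (uniform) proposal

def kernel(accept):
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if j != i:
                P[i, j] = Q[i, j] * accept(pi[i], pi[j])
        P[i, i] = 1.0 - P[i].sum()              # rejection mass stays put
    return P

P_metro  = kernel(lambda pi_i, pi_j: min(1.0, pi_j / pi_i))   # the larger kernel
P_barker = kernel(lambda pi_i, pi_j: pi_j / (pi_i + pi_j))    # entrywise smaller

def sigma2(P, f):
    Pi1 = np.outer(np.ones(n), pi)
    Z = np.linalg.inv(np.eye(n) - P + Pi1) - Pi1
    Gamma = (pi[:, None] * Z + (pi[:, None] * Z).T
             + np.outer(pi, pi) - np.diag(pi))
    return f @ Gamma @ f

f = rng.random(n); f -= pi @ f
print(sigma2(P_barker, f) >= sigma2(P_metro, f))   # True, as Lemma 3.4 predicts
```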