3 Reversible Markov Chains (September 10, 2002)

3.2 Reversible chains and weighted graphs

Our convention is that a graph has finite vertex-set ${\cal V} = \{v, x, y, \ldots\}$ and edge-set ${\cal E} = \{e_1, e_2, \ldots\}$, is connected and undirected, has no multiple edges, and has no self-loops. In a weighted graph, each edge $(v,x)$ also has a weight $0 < w_{vx} = w_{xv} < \infty$, and we allow a weighted graph to have self-loops.

Given a weighted graph, there is a natural definition of a Markov chain on the vertices. This requires an arbitrary choice of convention: do we want to regard an absent edge as having weight $0$ or weight $+\infty$? In terms of electrical networks (Section 3.3) the question is whether to regard weights as conductances or as resistances of wires. Conceptually one can make good arguments for either choice, but formulas look simpler with the conductance convention (absent edges have weight $0$), so we'll adopt that convention. Define discrete-time random walk on a weighted graph to be the Markov chain with transition matrix

p_{vx} := w_{vx}/w_v, \quad x \neq v \qquad (3.13)

where

w_v := \sum_x w_{vx}, \qquad w := \sum_v w_v.

Note that $w$ is the total edge-weight, with each edge counted twice, i.e., once in each direction. The fundamental fact is that this chain is automatically reversible with stationary distribution

\pi_v \equiv w_v/w \qquad (3.14)

because (3.1) is obviously satisfied: $\pi_v p_{vx} = \pi_x p_{xv} = w_{vx}/w$. Our standing convention that graphs be connected implies that the chain is irreducible. Conversely, with our standing convention that chains be irreducible, any reversible chain can be regarded as random walk on the weighted graph with edge-weights $w_{vx} := \pi_v p_{vx}$. Note also that the "aperiodic" condition for a Markov chain (occurring in the convergence theorem, Chapter 2 Theorem 2 [9/10/99 version]) is just the condition that the graph not be bipartite.
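For readers who like to experiment, here is a minimal Python/NumPy sketch, on a hypothetical 4-vertex weighted graph of our own choosing (not one from the text), confirming that (3.14) satisfies detailed balance (3.1) and is stationary for the walk defined by (3.13).

```python
import numpy as np

W = np.array([[0., 2., 1., 0.],
              [2., 0., 3., 1.],
              [1., 3., 0., 2.],
              [0., 1., 2., 0.]])   # symmetric edge-weights; 0 marks an absent edge

w_v = W.sum(axis=1)                # w_v = sum_x w_vx
w = w_v.sum()                      # total weight w, each edge counted twice
P = W / w_v[:, None]               # transition matrix (3.13): p_vx = w_vx / w_v
pi = w_v / w                       # claimed stationary distribution (3.14)

assert np.allclose(pi @ P, pi)     # stationarity: pi P = pi
F = pi[:, None] * P                # F[v, x] = pi_v p_vx = w_vx / w
assert np.allclose(F, F.T)         # detailed balance (3.1)
print(pi)
```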

An unweighted graph can be fitted into this setup by simply assigning weight $1$ to each edge. Since we'll be talking a lot about this case, let's write out the specialization explicitly. The transition matrix becomes

p_{vx} = \begin{cases} 1/d_v & \mbox{if $(v,x)$ is an edge} \\ 0 & \mbox{if not} \end{cases}

where $d_v$ is the degree of vertex $v$. The stationary distribution becomes

\pi_v = \frac{d_v}{2|{\cal E}|} \qquad (3.15)

where $|{\cal E}|$ is the number of edges of the graph. In particular, on an unweighted regular graph the stationary distribution is uniform.

In continuous time there are two different ways to associate a walk with a weighted or unweighted graph. One way (and we use this way unless otherwise mentioned) is just to use (3.13) as the definition of the transition rates $q_{vx}$. In the language of Chapter 2 (9/10/99 version) this is the continuization of the discrete-time walk, and has the same stationary distribution and mean hitting times as the discrete-time walk. The alternative definition, which we call the fluid model, uses the weights directly as transition rates:

q_{vx} := w_{vx}, \quad x \neq v. \qquad (3.16)

In this model the stationary distribution is always uniform (cf. Section 3.2.1). In the case of an unweighted regular graph the two models are identical up to a deterministic time rescaling, but for non-regular graphs there are typically no exact relations between numerical quantities for the two continuous-time models. Note that, given an arbitrary continuous-time reversible chain, we can define edge-weights $(w_{ij})$ via

\pi_i q_{ij} = \pi_j q_{ji} = w_{ij}, \mbox{ say,}

but the weights $(w_{ij})$ do not completely determine the chain: we can specify the $\pi_i$ independently and then solve for the $q$'s.
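Both points, that the continuization inherits the distribution $\pi_v = w_v/w$ while the fluid model (3.16) has uniform stationary distribution, are easy to confirm numerically. A sketch, again on a hypothetical 3-vertex weighted graph of our own choosing:

```python
import numpy as np

W = np.array([[0., 2., 1.],
              [2., 0., 3.],
              [1., 3., 0.]])

w_v = W.sum(axis=1)
Q_cont = W / w_v[:, None]          # continuization: q_vx = p_vx = w_vx / w_v
np.fill_diagonal(Q_cont, -1.0)     # diagonal chosen so rows sum to 0
Q_fluid = W.copy()                 # fluid model (3.16): q_vx = w_vx
np.fill_diagonal(Q_fluid, -w_v)
# Note: for ANY positive pi, the rates q_ij = w_ij / pi_i also have
# edge-weights w_ij, so the weights alone do not determine the chain.

def stationary(Q):
    # Solve pi Q = 0 with sum(pi) = 1 by least squares.
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

print(stationary(Q_cont))    # proportional to w_v, same as the discrete walk
print(stationary(Q_fluid))   # uniform, as asserted above
```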

Though there's no point in writing out all the specializations of the general theory of Chapter 2 (9/10/99 version), let us emphasize the simple expressions for mean return times of discrete-time walk obtained from Chapter 2 Lemma 5 and the expressions (3.14)-(3.15) for the stationary distribution.

Lemma 3.5

For random walk on an $n$-vertex graph,

E_v T^+_v = \frac{w}{w_v} \qquad \mbox{(weighted)}
          = \frac{2|{\cal E}|}{d_v} \qquad \mbox{(unweighted)}
          = n \qquad \mbox{(unweighted regular)}.
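As a quick numerical sanity check (not part of the formal development), the following Python/NumPy sketch solves the usual one-step hitting-time equations on the hypothetical 4-vertex weighted graph used earlier and confirms the weighted case of Lemma 3.5 at every vertex.

```python
import numpy as np

W = np.array([[0., 2., 1., 0.],
              [2., 0., 3., 1.],
              [1., 3., 0., 2.],
              [0., 1., 2., 0.]])
w_v = W.sum(axis=1); w = w_v.sum()
P = W / w_v[:, None]
n = len(W)

for v in range(n):
    # h[x] = E_x T_v solves (I - P restricted to states != v) h = 1
    idx = [x for x in range(n) if x != v]
    h = np.linalg.solve(np.eye(n - 1) - P[np.ix_(idx, idx)], np.ones(n - 1))
    h_full = np.zeros(n); h_full[idx] = h
    ret = 1.0 + P[v] @ h_full        # E_v T^+_v = 1 + sum_x p_vx E_x T_v
    assert np.isclose(ret, w / w_v[v])
```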
Example 3.6

Chess moves.

Here is a classic homework problem for an undergraduate Markov chains course.

Start a knight at a corner square of an otherwise-empty chessboard. Move the knight at random, by choosing uniformly from the legal knight-moves at each step. What is the mean number of moves until the knight returns to the starting square?

It's a good question, because if you don't know Markov chain theory it looks too messy to do by hand, whereas using Markov chain theory it becomes very simple. The knight is performing random walk on a graph (the 64 squares are the vertices, and the possible knight-moves are the edges). It is not hard to check that the graph is connected, so by the elementary Lemma 3.5, for a corner square $v$ (which has degree $2$) the mean return time is

E_v T^+_v = \frac{1}{\pi_v} = \frac{2|{\cal E}|}{d_v} = |{\cal E}|,

and by drawing a sketch in the margin the reader can count the number of edges $|{\cal E}|$ to be $168$.
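Readers without a chessboard at hand can let a computer do the counting. This short sketch (plain Python; the move set is just the standard knight's move) tallies degrees over the 64 squares and recovers $|{\cal E}| = 168$.

```python
# The eight possible knight moves as (row, column) offsets.
moves = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]

def degree(r, c):
    # Number of legal knight moves from square (r, c) on an 8x8 board.
    return sum(0 <= r + dr < 8 and 0 <= c + dc < 8 for dr, dc in moves)

total_degree = sum(degree(r, c) for r in range(8) for c in range(8))
num_edges = total_degree // 2          # each edge is counted once per endpoint
print(num_edges)                       # 168
print(total_degree // degree(0, 0))    # mean return time 2|E|/d_v = 336/2 = 168
```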

The following cute variation of Lemma 3.5 is sometimes useful. Given the discrete-time random walk $(X_t)$, consider the process

Z_t = (X_{t-1}, X_t)

recording the present position at time $t$ and also the previous position. Clearly $(Z_t)$ is a Markov chain whose state-space is the set $\stackrel{\rightarrow}{{\cal E}}$ of directed edges, and its stationary distribution ($\rho$, say) is

\rho(v,x) = \frac{w_{vx}}{w}

in the general weighted case, and hence

\rho(v,x) = \frac{1}{|\stackrel{\rightarrow}{{\cal E}}|}, \qquad (v,x) \in \stackrel{\rightarrow}{{\cal E}}

in the unweighted case. Now given an edge $(x,v)$, we can apply Chapter 2 Lemma 5 (9/10/99 version) to $(Z_t)$ and the state $(x,v)$ to deduce the following.

Lemma 3.7

Given an edge $(v,x)$, define

U := \min\{t \geq 1 : X_t = v,\ X_{t-1} = x\}.

Then

E_v U = \frac{w}{w_{vx}} \qquad \mbox{(weighted)}
      = 2|{\cal E}| \qquad \mbox{(unweighted)}.
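Lemma 3.7 is also easy to test by simulation. The following Monte Carlo sketch (Python/NumPy, on the hypothetical 3-vertex weighted graph from above; the sample size is arbitrary) estimates $E_v U$ for the directed edge $(x,v)$ and compares it with $w/w_{vx}$.

```python
import numpy as np

rng = np.random.default_rng(0)
W = np.array([[0., 2., 1.],
              [2., 0., 3.],
              [1., 3., 0.]])
w_v = W.sum(axis=1); w = w_v.sum()
P = W / w_v[:, None]

v, x = 0, 1                          # estimate E_v U for the edge (x, v)
times = []
for _ in range(20000):
    cur, prev, t = v, None, 0
    while not (cur == v and prev == x):   # stop when edge x -> v is traversed
        prev = cur
        cur = rng.choice(len(W), p=P[cur])
        t += 1
    times.append(t)
print(np.mean(times), w / W[v, x])   # both should be ~ 12/2 = 6
```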
Corollary 3.8 (The edge-commute inequality)

For an edge $(v,x)$,

E_v T_x + E_x T_v \leq \begin{cases} \frac{w}{w_{vx}} & \mbox{(weighted)} \\ 2|{\cal E}| & \mbox{(unweighted)}. \end{cases}

This is immediate from Lemma 3.7: starting at $v$, the walk cannot traverse the directed edge $(x,v)$ until it has first reached $x$ and then returned to $v$, so $E_v T_x + E_x T_v \leq E_v U$. We shall soon see (Section 3.3.3) that this inequality has a natural interpretation in terms of electrical resistance, but it is worth remembering that the result is more elementary than that.
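A numerical illustration may still be welcome. The sketch below (same hypothetical 4-vertex graph as earlier, Python/NumPy) computes all hitting times exactly and checks the weighted edge-commute inequality on every edge.

```python
import numpy as np

W = np.array([[0., 2., 1., 0.],
              [2., 0., 3., 1.],
              [1., 3., 0., 2.],
              [0., 1., 2., 0.]])
w_v = W.sum(axis=1); w = w_v.sum()
P = W / w_v[:, None]; n = len(W)

def hit(v):
    # Vector h with h[x] = E_x T_v (and h[v] = 0).
    idx = [x for x in range(n) if x != v]
    h = np.zeros(n)
    h[idx] = np.linalg.solve(np.eye(n - 1) - P[np.ix_(idx, idx)], np.ones(n - 1))
    return h

H = np.array([hit(v) for v in range(n)])   # H[v, x] = E_x T_v
for v in range(n):
    for x in range(v + 1, n):
        if W[v, x] > 0:                    # commute time E_v T_x + E_x T_v
            assert H[x, v] + H[v, x] <= w / W[v, x] + 1e-9
```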

Here is another variant of Lemma 3.5.

Lemma 3.9

For random walk on a weighted $n$-vertex graph,

\sum_{e = (v,x)} w_e (E_v T_x + E_x T_v) = w(n-1)

where the sum is over undirected edges.

Proof. Writing $\sum_v \sum_x$ for the sum over directed edges $(v,x)$, the left side equals

\frac{1}{2} \sum_v \sum_x w_{vx} (E_v T_x + E_x T_v)
 = \sum_v \sum_x w_{vx} E_x T_v \qquad \mbox{by symmetry}
 = w \sum_v \pi_v \sum_x p_{vx} E_x T_v
 = w \sum_v \pi_v (E_v T^+_v - 1)
 = w \sum_v \pi_v \left( \frac{1}{\pi_v} - 1 \right)
 = w(n-1). \qquad \rule{4.3pt}{4.3pt}
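Continuing the sketch given after Corollary 3.8 (and reusing its variables W, P, H, n, w), the identity of Lemma 3.9 can be checked in two lines:

```python
# Sum over undirected edges of w_e (E_v T_x + E_x T_v); should equal w (n - 1).
total = sum(W[v, x] * (H[x, v] + H[v, x])
            for v in range(n) for x in range(v + 1, n) if W[v, x] > 0)
assert np.isclose(total, w * (n - 1))
```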

3.2.1 The fluid model

Imagine a finite number of identical buckets which can hold unit quantity of fluid. Some pairs of buckets are connected by tubes through their bottoms. If a tube connects buckets $i$ and $j$ then, when the quantities of fluid in buckets $i$ and $j$ are $p_i$ and $p_j$, the flow rate through the tube should be proportional to the pressure difference and hence should be $w_{ij}(p_i - p_j)$ in the direction $i \rightarrow j$, where $w_{ij} = w_{ji}$ is a parameter. Neglecting the fluid in the tubes, the quantities of fluid $(p_i(t))$ at time $t$ will evolve according to the differential equations

\frac{dp_j(t)}{dt} = \sum_{i \neq j} w_{ij} (p_i(t) - p_j(t)).

These of course are the same equations as the forward equations [(4) of Chapter 2, 9/10/99 version] for $p_i(t)$ (the probability of being in state $i$ at time $t$) for the continuous-time chain with transition rates $q_{ij} = w_{ij}$, $j \neq i$. Hence we call this particular way of defining a continuous-time chain in terms of a weighted graph the fluid model. Our main purpose in mentioning this notion is to distinguish it from the electrical network analogy in the next section. Our intuition about fluids says that as $t \rightarrow \infty$ the fluid will distribute itself uniformly amongst the buckets, which corresponds to the elementary fact that the stationary distribution of the "fluid model" chain is always uniform. Our intuition also says that increasing a "specific flow rate" parameter $w_{ij}$ will make the fluid settle faster, and this corresponds to a true fact about the "fluid model" Markov chain (in terms of the eigenvalue interpretation of asymptotic convergence rate; see Corollary 3.28). On the other hand the same assertion for the usual discrete-time chain or its continuization is simply false.
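To make the fluid picture concrete, here is a small Python/NumPy sketch (hypothetical 3-bucket network; the step size and time horizon are arbitrary choices) that integrates the forward equations and watches the fluid approach the uniform distribution. The last line computes the spectral gap of the generator, the quantity that Corollary 3.28 identifies with the asymptotic convergence rate.

```python
import numpy as np

W = np.array([[0., 2., 1.],
              [2., 0., 3.],
              [1., 3., 0.]])
Q = W - np.diag(W.sum(axis=1))      # fluid-model generator (3.16): q_ij = w_ij

p = np.array([1.0, 0.0, 0.0])       # all fluid starts in bucket 0
dt = 0.01
for _ in range(2000):
    p = p + dt * (p @ Q)            # Euler step of dp/dt = p Q
print(p)                            # ~ [1/3, 1/3, 1/3]: uniform

# -Q is a (symmetric) weighted graph Laplacian; its second-smallest
# eigenvalue is the spectral gap, nondecreasing in each weight w_ij.
gap = sorted(np.linalg.eigvalsh(-Q))[1]
print(gap)
```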