3 Reversible Markov Chains (September 10, 2002)

3.7 Extremal characterizations and mean hitting times

Theorem 3.36 (Extremal characterization of mean commute times)

For distinct states ii and aa, the mean commute time satisfies

EiTa+EaTi=sup{1/(g,g):0g1,  g(i)=1,  g(a)=0}E_{i}T_{a}+E_{a}T_{i}=\sup\{1/\mbox{${\cal E}$}(g,g):0\leq g\leq 1,\ g(i)=1,\ % g(a)=0\} (3.90)

and the sup is attained by g(j)=Pj(Ti<Ta)g(j)=P_{j}(T_{i}<T_{a}). In discrete time, for a subset AA and a state iAi\not\in A,

πiPi(TA<Ti+)=inf{(g,g):0g1,g(i)=1,g()=0 on AA}\pi_{i}P_{i}(T_{A}<T^{+}_{i})=\inf\{\mbox{${\cal E}$}(g,g):0\leq g\leq 1,\ g(i% )=1,\ g(\cdot)=0\mbox{\rm\ on~{}$A$}\} (3.91)

and the inf is attained by g(j)=Pj(Ti<TA)g(j)=P_{j}(T_{i}<T_{A}). Equation (3.91) remains true in continuous time, with πi\pi_{i} replaced by qiπiq_{i}\pi_{i} on the left.

Proof. As noted at (3.28), form (3.90) follows (in either discrete or continuous time) from form (3.91) with A={a}A=\{a\}. To prove (3.91), consider gg satisfying the specified boundary conditions. Inspecting (3.74), the contribution to (g,g)\mbox{${\cal E}$}(g,g) involving a fixed state jj is

kjπjpjk(g(k)-g(j))2.\sum_{k\neq j}\pi_{j}p_{jk}(g(k)-g(j))^{2}. (3.92)

As a function of g(j)g(j) this is minimized by

g(j)=kpjkg(k).g(j)=\sum_{k}p_{jk}g(k). (3.93)

Thus the gg which minimizes {\cal E} subject to the prescribed boundary conditions on A{i}A\cup\{i\} must satisfy (3.93) for all jA{i}j\not\in A\cup\{i\}, and by Chapter 2 margin: 9/10/99 versionLemma 27 the unique solution of these equations is g(j)=Pj(Ti<TA)g(j)=P_{j}(T_{i}<T_{A}). Now apply to this gg the general expression (3.75):

(g,g)=jπjg(j)(g(j)-kpjkg(k)).\mbox{${\cal E}$}(g,g)=\sum_{j}\pi_{j}g(j)\left(g(j)-\sum_{k}p_{jk}g(k)\right).

For jA{i}j\not\in A\cup\{i\} the factor (g(j)-kpjkg(k))(g(j)-\sum_{k}p_{jk}g(k)) equals zero, and for jAj\in A we have g(j)=0g(j)=0, so only the j=ij=i term contributes. Thus

(g,g)\displaystyle\mbox{${\cal E}$}(g,g) =\displaystyle= πi(1-kpikg(k))\displaystyle\pi_{i}\left(1-\sum_{k}p_{ik}g(k)\right) (3.94)
=\displaystyle= πi(1-Pi(Ti+<TA))\displaystyle\pi_{i}(1-P_{i}(T^{+}_{i}<T_{A}))
=\displaystyle= πiPi(TA<Ti+),\displaystyle\pi_{i}P_{i}(T_{A}<T^{+}_{i}),

giving (3.91).    

The analogous result for two disjoint subsets AA and BB is a little complicated to state. The argument above shows that

inf{(g,g):g()=0 on AA,  g()=1 on BB}\inf\{\mbox{${\cal E}$}(g,g):g(\cdot)=0\mbox{\ on~{}$A$},\ g(\cdot)=1\mbox{\ % on~{}$B$}\}

is attained by g0(j)=Pj(TB<TA)g_{0}(j)=P_{j}(T_{B}<T_{A}) and that this g0g_{0} satisfies

(g0,g0)=iBπiPi(TA<TB+).\mbox{${\cal E}$}(g_{0},g_{0})=\sum_{i\in B}\pi_{i}P_{i}(T_{A}<T^{+}_{B}). (3.95)

We want to interpret the reciprocal of this quantity as a mean time for the chain to commute from AA to BB and back. Consider the stationary chain (Xt;-<t<)(X_{t};-\infty<t<\infty). We can define what is technically called a “marked point process” which records the times at which AA is first visited after a visit to BB and vice versa. Precisely, define ZtZ_{t} taking values in {α,β,δ}\{\alpha,\beta,\delta\} by

Zt:={βif s<t\exists s<t such that XsAX_{s}\in AXtBX_{t}\in BXuABs<u<tX_{u}\not\in A\cup B\ \forall s<u<tαif s<t\exists s<t such that XsBX_{s}\in BXtAX_{t}\in AXuBAs<u<tX_{u}\not\in B\cup A\ \forall s<u<tδotherwise.Z_{t}:=\left\{\begin{array}[]{cl}\beta&\mbox{if $\exists s<t$ such that $X_{s}\in A$, $X_{t}\in B$, $X_{u}\not\in A\cup B\ \forall s<u<t$}\\ \alpha&\mbox{if $\exists s<t$ such that $X_{s}\in B$, $X_{t}\in A$, $X_{u}\not\in B\cup A\ \forall s<u<t$}\\ \delta&\mbox{otherwise.}\end{array}\right.

So the times tt when Zt=βZ_{t}=\beta are the times of first return to BB after visiting AA, and the times tt when Zt=αZ_{t}=\alpha are the times of first return to AA after visiting BB. Now (Zt)(Z_{t}) is a stationary process. By considering the time-reversal of XX, we see that for iBi\in B

P(X0=i,Z0=β)=P(X0=i,TA<TB+)=πiPi(TA<TB+).P(X_{0}=i,\,Z_{0}=\beta)=P(X_{0}=i,\,T_{A}<T^{+}_{B})=\pi_{i}P_{i}(T_{A}<T^{+}% _{B}).

So  (3.95) shows P(Z0=β)=(g0,g0)P(Z_{0}=\beta)=\mbox{${\cal E}$}(g_{0},g_{0}). If we define TBABT_{BAB}, “the typical time to go from BB to AA and back to BB”, to have the conditional distribution of margin: (discrete time chain) min{t1:Zt=β}\min\{t\geq 1:Z_{t}=\beta\} given Z0=βZ_{0}=\beta, then Kac’s formula for the (non-Markov) stationary process ZZ (see e.g. [133] Theorem 6.3.3) says that ETBAB=1/P(Z0=β)ET_{BAB}=1/P(Z_{0}=\beta). So we have proved

Corollary 3.37
ETBAB=sup{1/(g,g):0g1,  g()=0 on AA,  g()=1 on BB}ET_{BAB}=\sup\{1/\mbox{${\cal E}$}(g,g):0\leq g\leq 1,\ g(\cdot)=0\mbox{\rm\ % on~{}$A$},\ g(\cdot)=1\mbox{\rm\ on~{}$B$}\}

and the sup is attained by g(i)=Pi(TB<TA)g(i)=P_{i}(T_{B}<T_{A}).

As another interpretation of this quantity, define

ρB()=P(X0|Z0=β),ρA()=P(X0|Z0=α).\rho_{B}(\cdot)=P(X_{0}\in\cdot\,|\,Z_{0}=\beta),\ \ \rho_{A}(\cdot)=P(X_{0}% \in\cdot\,|\,Z_{0}=\alpha).

Interpret ρB\rho_{B} and ρA\rho_{A} as the distribution of hitting places on BB and on AA in the commute process. It is intuitively clear, and not hard to verify, that

PρA(X(TB))=ρB(),PρB(X(TA))=ρA()P_{\rho_{A}}(X(T_{B})\in\cdot)=\rho_{B}(\cdot),\ \ P_{\rho_{B}}(X(T_{A})\in% \cdot)=\rho_{A}(\cdot)
ETBAB=EρBTA+EρATB.ET_{BAB}=E_{\rho_{B}}T_{A}+E_{\rho_{A}}T_{B}.

In particular

miniBEiTA+miniAEiTBETBABmaxiBEiTA+maxiAEiTB.\min_{i\in B}E_{i}T_{A}+\min_{i\in A}E_{i}T_{B}\leq ET_{BAB}\leq\max_{i\in B}E% _{i}T_{A}+\max_{i\in A}E_{i}T_{B}.

3.7.1 Thompson’s principle and leveling networks

Theorem 3.36 was stated in terms of (reversible) Markov chains. Rephrasing in terms of discrete-time random walk on a weighted graph gives the usual “electrical network” formulation of the Dirichlet principle stated below, using (3.77),(3.91) and (3.94). Recall from Proposition 3.10 that the effective resistance rr between v0v_{0} and AA is, in terms of the random walk,

r=1wv0Pv0(TA<Tv0+).r=\frac{1}{w_{v_{0}}P_{v_{0}}(T_{A}<T^{+}_{v_{0}})}. (3.96)
Proposition 3.38 (The Dirichlet principle)

Take a weighted graph and fix a vertex v0v_{0} and a subset AA of vertices not containing v0v_{0}. Then the quantity 12ijwij(g(j)-g(i))2\frac{1}{2}\sum_{i}\sum_{j}w_{ij}(g(j)-g(i))^{2} is minimized, over all functions g:I[0,1]g:I\rightarrow[0,1] with g(v0)=1g(v_{0})=1 and g()=0g(\cdot)=0 on AA, by the function g(i):Pi(Tv0<TA)g(i):\equiv P_{i}(T_{v_{0}}<T_{A}) (where probabilities refer to random walk on the weighted graph), and the minimum value equals 1/r1/r, where rr is the effective resistance (3.96).

There is a dual form of the Dirichlet principle, which following Doyle and Snell [131] we call

Proposition 3.39 (Thompson’s principle)

Take a weighted graph and fix a vertex v0v_{0} and a subset AA of vertices not containing v0v_{0}. Let 𝐟=fij{\bf f}=f_{ij} denote a unit flow from v0v_{0} to AA. Then 12ij(fij2/wij)\frac{1}{2}\sum_{i}\sum_{j}(f^{2}_{ij}/w_{ij}) is minimized, over all such flows, by the flow 𝐟v0A{\bf f}^{v_{0}\rightarrow A} [defined at (3.18)] associated with the random walk from v0v_{0} to AA, and the minimum value equals the effective resistance rr appearing in (3.96).

Recall that a flow is required to have fij=0f_{ij}=0 whenever wij=0w_{ij}=0, and interpret sums ij\sum_{i}\sum_{j} as sums over ordered pairs (i,j)(i,j) with wij>0w_{ij}>0.

Proof. Write ψ(𝐟):=12ij(fij2/wij)\psi({\bf f}):=\frac{1}{2}\sum_{i}\sum_{j}(f^{2}_{ij}/w_{ij}). By formula (3.25) relating the random walk notions of “flow” and “potential”, the fact that ψ(𝐟v0A)=r\psi({\bf f}^{v_{0}\rightarrow A})=r is immediate from the corresponding equality in the Dirichlet principle. So the issue is to prove that for a unit flow 𝐟*{\bf f}^{*}, say, attaining the minimum of ψ(𝐟)\psi({\bf f}), we have ψ(𝐟*)=ψ(𝐟v0A)\psi({\bf f}^{*})=\psi({\bf f}^{v_{0}\rightarrow A}). To prove this, consider two arbitrary paths (yi)(y_{i}) and (zj)(z_{j}) from v0v_{0} to AA, and let 𝐟ε{\bf f}^{\varepsilon} denote the flow 𝐟*{\bf f}^{*} modified by adding flow rates +ε+\varepsilon along the edges (yi,yi+1)(y_{i},y_{i+1}) and by adding flow rates -ε-\varepsilon along the edges (zi,zi+1)(z_{i},z_{i+1}). Then 𝐟ε{\bf f}^{\varepsilon} is still a unit flow from v0v_{0} to AA. So the function εψ(𝐟ε)\varepsilon\rightarrow\psi({\bf f}^{\varepsilon}) must have derivative zero at ε=0\varepsilon=0, and this becomes the condition that

i(fyi,yi+1*/wyi,yi+1)=i(fzi,zi+1*/wzi,zi+1).\sum_{i}(f^{*}_{y_{i},y_{i+1}}/w_{y_{i},y_{i+1}})=\sum_{i}(f^{*}_{z_{i},z_{i+1% }}/w_{z_{i},z_{i+1}}).

So the sum is the same for all paths from v0v_{0} to AA. Fixing xx, the sum must be the same for all paths from xx to AA, because two paths from xx to AA could be extended to paths from v0v_{0} to AA by appending a common path from v0v_{0} to xx. It follows that we can define g*(x)g^{*}(x) as the sum i(fxi,xi+1*/wxi,xi+1)\sum_{i}(f^{*}_{x_{i},x_{i+1}}/w_{x_{i},x_{i+1}}) over some path (xi)(x_{i}) from xx to AA, and the sum does not depend on the path chosen. So margin: (This is essentially the same argument used in Section 3.3.2.)

g*(x)-g*(z)=fxz*wxz for each edge (x,z)(x,z) not contained within AA.g^{*}(x)-g^{*}(z)=\frac{f^{*}_{xz}}{w_{xz}}\mbox{\ for each edge $(x,z)$ not % contained within~{}$A$}. (3.97)

The fact that 𝐟*{\bf f}^{*} is a flow means that, for xA{v0}x\not\in A\cup\{v_{0}\},

0=z:wxz>0fxz*=zwxz(g*(x)-g*(z)).0=\sum_{z:w_{xz}>0}f^{*}_{xz}=\sum_{z}w_{xz}(g^{*}(x)-g^{*}(z)).

So g*g^{*} is a harmonic function outside A{v0}A\cup\{v_{0}\}, and g*=0g^{*}=0 on AA. So by the uniqueness result (Chapter 2 margin: 9/10/99 versionLemma 27) we have that g*g^{*} must be proportional to gg, the minimizing function in Proposition 3.38. So 𝐟*{\bf f}^{*} is proportional to 𝐟v0A{\bf f}^{v_{0}\rightarrow A}, because the relationship (3.97) holds for both, and then 𝐟*=𝐟v0A{\bf f}^{*}={\bf f}^{v_{0}\rightarrow A} because both are unit flows.    

A remarkable statistical interpretation was discussed in a monograph of Borre and Meissl [57]. Imagine a finite set of locations such as hilltops. For each pair of locations (i,j)(i,j) with a clear line-of-sight, measure the elevation difference Dij=D_{ij}= (height of jj minus height of ii). Consider the associated graph [whose edges are such pairs (i,j)(i,j)], and suppose it is connected. Take one location v0v_{0} as a benchmark “height 00”. If our measurements were exact we could determine the height of location xx by adding the DD’s along a path from v0v_{0} to xx, and the sum would not depend on the path chosen. But suppose our measurements contain random errors. Precisely, suppose Di,jD_{i,j} equals the true height difference h(j)-h(i)h(j)-h(i) plus an error YijY_{ij} which has mean 00 and variance 1/wij1/w_{ij} and is independent for different measurements. Then it seems natural to estimate the height of xx by taking some average h^(x)\hat{h}(x) over paths from v0v_{0} to xx, and it turns out that the “best” way to average is to use the random walk from v0v_{0} to xx and average (over realizations of the walk) the net height climbed by the walk.

In mathematical terms, the problem is to choose weights fijf_{ij}, not depending on the function hh, such that

h^(x):=12ijfijDij\hat{h}(x):=\frac{1}{2}\sum_{i}\sum_{j}f_{ij}D_{ij}

has Eh^(x)=h(x)E\hat{h}(x)=h(x) and minimal variance. It is not hard to see that the former “unbiased” property holds iff ff is a unit flow from v0v_{0} to xx. Then

varh^(x)=14ijfij2var(Dij)=14ijfij2wij{\rm var}\ \hat{h}(x)=\frac{1}{4}\sum_{i}\sum_{j}f^{2}_{ij}\mbox{var}(D_{ij})=% \frac{1}{4}\sum_{i}\sum_{j}\frac{f^{2}_{ij}}{w_{ij}}

and Proposition 3.39 says this is minimized when we use the flow from v0v_{0} to xx obtained from the random walk on the weighted graph. But then

h^(x)=Ev0t=1TxDXt-1Xt,\hat{h}(x)=E_{v_{0}}\sum_{t=1}^{T_{x}}D_{X_{t-1}X_{t}},

the expectation referring to the random walk.

3.7.2 Hitting times and Thompson’s principle

Using the commute interpretation of resistance (Corollary 3.11) to translate Thompson’s principle into an assertion about mean commute times gives the following.

Corollary 3.40

For random walk on a weighted graph and distinct vertices vv and aa,

EvTa+EaTv=winf{12ij(fij2/wij):𝐟 is a unit flow from aa to vv}E_{v}T_{a}+E_{a}T_{v}=w\,\inf\left\{\frac{1}{2}\sum_{i}\sum_{j}(f^{2}_{ij}/w_{% ij}):{\bf f}\mbox{\rm\ is a unit flow from~{}$a$ to~{}$v$}\right\}

and the min\min is attained by the flow 𝐟av{\bf f}^{a\rightarrow v} associated with the random walk.

Comparing with Theorem 3.36 we have two different extremal characterizations of mean commute times, as a sup over potential functions and as an inf over flows. In practice this “flow” form is less easy to use than the “potential” form, because writing down a flow 𝐟{\bf f} is harder than writing down a function gg. But, when we can write down and calculate with some plausible flow, it gives upper bounds on mean commute times.

One-sided mean hitting times EiTjE_{i}T_{j} don’t have simple extremal characterizations of the same kind, with the exception of hitting times from stationarity. To state the result, we need two definitions. First, given a probability distribution ρ\rho on vertices, a unit flow from aa to ρ\rho is a flow ff satisfying

f(i)=1(i=a)-ρi  for all i;f_{(i)}=1_{(i=a)}-\rho_{i}\mbox{\ \ for all\ }i; (3.98)

more generally, a unit flow from a set AA to ρ\rho is defined to satisfy

iAf(i)=1-ρ(A)  and  f(i)=-ρi for all iAci\in A^{c}.\sum_{i\in A}f_{(i)}=1-\rho(A)\mbox{\ \ and\ \ }f_{(i)}=-\rho_{i}\mbox{\ for % all $i\in A^{c}$}.

Now fix a state aa and define the special flow 𝐟aπ{\bf f}^{a\rightarrow\pi} by

fij:=limt0Eat=1t0(1(Xt-1=i,Xt=j)-1(Xt-1=j,Xt=i))f_{ij}:=\lim_{t_{0}\rightarrow\infty}E_{a}\sum_{t=1}^{t_{0}}\left(1_{(X_{t-1}=% i,\,X_{t}=j)}-1_{(X_{t-1}=j,\,X_{t}=i)}\right) (3.99)

with the usual convention in the periodic case. So fijf_{ij} is the mean excess of transitions iji\rightarrow j compared to transitions jij\rightarrow i, for the chain started at aa and run forever. This is a unit flow from aa to π\pi, in the above sense. Equation (6) (the definition of 𝐙{\bf Z}) in Chapter 2 margin: 9/10/99 versionand reversibility give the first equality, and Chapter 2 margin: 9/10/99 versionLemma 12 gives the last equality, in

fij\displaystyle f_{ij} =\displaystyle= Zaipij-Zajpji\displaystyle Z_{ai}p_{ij}-Z_{aj}p_{ji} (3.100)
=\displaystyle= Ziaπipijπa-Zjaπjpjiπa\displaystyle\frac{Z_{ia}\pi_{i}p_{ij}}{\pi_{a}}-\frac{Z_{ja}\pi_{j}p_{ji}}{% \pi_{a}}
=\displaystyle= (Zia-Zja)πipijπa\displaystyle\frac{(Z_{ia}-Z_{ja})\pi_{i}p_{ij}}{\pi_{a}}
=\displaystyle= (EjTa-EiTa)wijw,\displaystyle\frac{(E_{j}T_{a}-E_{i}T_{a})w_{ij}}{w}, (3.101)

switching to “weighted graphs” notation. Note also that the first-step recurrence for the function iZiai\mapsto Z_{ia} is

Zia=(1(i=a)-πa)+jpijZja.Z_{ia}=\left(1_{(i=a)}-\pi_{a}\right)+\sum_{j}p_{ij}Z_{ja}. (3.102)
Proposition 3.41

For random walk on a weighted graph and a subset AA of vertices,

EπTA\displaystyle E_{\pi}T_{A} =\displaystyle= winf{12ij(fij2/wij):𝐟 is a unit flow from AA to π\pi}\displaystyle w\,\inf\left\{\frac{1}{2}\sum_{i}\sum_{j}(f^{2}_{ij}/w_{ij}):{% \bf f}\mbox{\rm\ is a unit flow from~{}$A$ to~{}$\pi$}\right\}
=\displaystyle= sup{1/(g,g):-<g<,  g()=1 on AA,  iπig(i)=0}.\displaystyle\sup\left\{1/\mbox{${\cal E}$}(g,g):-\infty<g<\infty,\ g(\cdot)=1% \mbox{\rm\ on~{}$A$},\ \sum_{i}\pi_{i}g(i)=0\right\}.

When AA is a singleton {a}\{a\}, the minimizing flow is the flow 𝐟aπ{\bf f}^{a\rightarrow\pi} defined above, and the maximizing function gg is g(i)=Zia/Zaag(i)=Z_{ia}/Z_{aa}. For general AA the maximizing function gg is g(i)=1-EiTAEπTAg(i)=1-\frac{E_{i}T_{A}}{E_{\pi}T_{A}}.

margin: Added the final observation about general AA

Proof. Suppose first that A={a}A=\{a\}. We start by showing that the extremizing flow, 𝐟*{\bf f}^{*} say, is the asserted 𝐟aπ{\bf f}^{a\rightarrow\pi}. By considering adding to 𝐟*{\bf f}^{*} a flow of size ε\varepsilon along a directed cycle, and copying the argument for (3.97) in the proof of Proposition 3.39, there must exist a function g*g^{*} such that

g*(x)-g*(z)=fxz*wxz for each edge (x,z).g^{*}(x)-g^{*}(z)=\frac{f^{*}_{xz}}{w_{xz}}\mbox{\ for each edge\ }(x,z). (3.103)

The fact that 𝐟*{\bf f}^{*} is a unit flow from aa to π\pi says that

1(x=a)-πx=zfxz*=zwxz(g*(x)-g*(z))1_{(x=a)}-\pi_{x}=\sum_{z}f^{*}_{xz}=\sum_{z}w_{xz}(g^{*}(x)-g^{*}(z))

which implies

1(x=a)-πxwx=zpxz(g*(x)-g*(z))=g*(x)-zpxzg*(z).\frac{1_{(x=a)}-\pi_{x}}{w_{x}}=\sum_{z}p_{xz}(g^{*}(x)-g^{*}(z))=g^{*}(x)-% \sum_{z}p_{xz}g^{*}(z).

Since wx=wπxw_{x}=w\pi_{x} and 1/πa=EaTa+1/\pi_{a}=E_{a}T^{+}_{a}, this becomes

g*(x)=zpxzg*(z)-w-1(1-(EaTa+)1(x=a)).g^{*}(x)=\sum_{z}p_{xz}g^{*}(z)-w^{-1}\left(1-(E_{a}T^{+}_{a})1_{(x=a)}\right).

Now these equations have a unique solution g*g^{*}, up to an additive constant, because the difference between two solutions is a harmonic function. (Recall Chapter 2 margin: 9/10/99 versionCorollary 28.) On the other hand, a solution is g*(x)=-ExTawg^{*}(x)=-\frac{E_{x}T_{a}}{w}, by considering the first-step recurrence for ExTa+E_{x}T^{+}_{a}. So by (3.103) fxz*=(EzTa-ExTa)wxz/wf^{*}_{xz}=(E_{z}T_{a}-E_{x}T_{a})w_{xz}/w, and so 𝐟*=𝐟aπ{\bf f}^{*}={\bf f}^{a\rightarrow\pi} by (3.101).

Now consider the function gg which minimizes (g,g)\mbox{${\cal E}$}(g,g) under the constraints iπig(i)=0\sum_{i}\pi_{i}g(i)=0 and g(a)=1g(a)=1. By introducing a Lagrange multiplier γ\gamma we may consider gg as minimizing (g,g)+γiπig(i)\mbox{${\cal E}$}(g,g)+\gamma\sum_{i}\pi_{i}g(i) subject to g(a)=1g(a)=1. Repeating the argument at (3.92), the minimizing gg satisfies

-2kjπjpjk(g(k)-g(j))+γπj=0, ja.-2\sum_{k\neq j}\pi_{j}p_{jk}(g(k)-g(j))\ +\gamma\pi_{j}=0,\ \ j\neq a.

Rearranging, and introducing a term β1(j=a)\beta 1_{(j=a)} to cover the case j=aj=a, we have

g(j)=kpjkg(k)-(γ/2)+β1(j=a) for all j,g(j)=\sum_{k}p_{jk}g(k)\ -(\gamma/2)+\beta 1_{(j=a)}\mbox{\ for all\ }j,

for some γ,β\gamma,\beta. Because jπjg(j)=0\sum_{j}\pi_{j}g(j)=0 we have

0=0-(γ/2)+βπa,0=0-(\gamma/2)+\beta\pi_{a},

allowing us to rewrite the equation as

g(j)=kpjkg(k)+β(1(j=a)-πa).g(j)=\sum_{k}p_{jk}g(k)\ +\beta(1_{(j=a)}-\pi_{a}).

By the familiar “harmonic function” argument this has a unique solution, and (3.102) shows the solution is g(j)=βZjag(j)=\beta Z_{ja}. Then the constraint g(a)=1g(a)=1 gives g(j)=Zja/Zaag(j)=Z_{ja}/Z_{aa}.

Next consider the relationship between the flow 𝐟=𝐟aπ{\bf f}={\bf f}^{a\rightarrow\pi} and the function g(i)Zia/Zaag(i)\equiv Z_{ia}/Z_{aa}. We have margin: Chapter 2 reference in following display is to 9/10/99 version.

wfij2wij\displaystyle w\ \frac{f^{2}_{ij}}{w_{ij}} =\displaystyle= (EjTa-EiTa)2wijw  by (3.101)\displaystyle(E_{j}T_{a}-E_{i}T_{a})^{2}\frac{w_{ij}}{w}\mbox{\ \ by~{}(\ref{% flT})}
=\displaystyle= (EπTa)2wijw(EjTa-EiTaEπTa)2\displaystyle(E_{\pi}T_{a})^{2}\frac{w_{ij}}{w}\left(\frac{E_{j}T_{a}-E_{i}T_{% a}}{E_{\pi}T_{a}}\right)^{2}
=\displaystyle= (EπTa)2wijw(Zia-ZjaZaa)2  by Chapter 2 Lemmas 11, 12\displaystyle(E_{\pi}T_{a})^{2}\frac{w_{ij}}{w}\left(\frac{Z_{ia}-Z_{ja}}{Z_{% aa}}\right)^{2}\mbox{\ \ by Chapter~{}2{} Lemmas~{}11, 12}
=\displaystyle= (EπTa)2wijw(g(i)-g(j))2.\displaystyle(E_{\pi}T_{a})^{2}\frac{w_{ij}}{w}\ (g(i)-g(j))^{2}.

Thus it is enough to prove

w12ijfij2wij=EπTaw\frac{1}{2}\sum_{i}\sum_{j}\frac{f^{2}_{ij}}{w_{ij}}=E_{\pi}T_{a} (3.104)

and it will then follow that

1/(g,g)=EπTa.1/\mbox{${\cal E}$}(g,g)=E_{\pi}T_{a}.

To prove (3.104), introduce a parameter ε\varepsilon (which will later go to 00) and a new vertex zz and edge-weights wiz=εwiw_{iz}=\varepsilon w_{i}. Writing superscripts ε\ {}^{\varepsilon} to refer to this new graph and its random walk, the equation becomes wizε:=εwiw^{\varepsilon}_{iz}:=\varepsilon w_{i} and Corollary 3.40 says

EaεTz+EzεTa=wε12iIεjIε(fijε)2wijεE^{\varepsilon}_{a}T_{z}+E^{\varepsilon}_{z}T_{a}=w^{\varepsilon}\frac{1}{2}% \sum_{i\in I^{\varepsilon}}\sum_{j\in I^{\varepsilon}}\frac{(f^{\varepsilon}_{% ij})^{2}}{w^{\varepsilon}_{ij}} (3.105)

where 𝐟ε{\bf f}^{\varepsilon} is the special unit flow from aa to zz associated with the new graph (which has vertex set Iε:=I{z}I^{\varepsilon}:=I\cup\{z\}). We want to interpret the ingredients to (3.105) in terms of the original graph. Clearly wε=w(1+2ε)w^{\varepsilon}=w(1+2\varepsilon). The new walk has chance ε/(1+ε)\varepsilon/(1+\varepsilon) to jump to zz from each other vertex, so EaεTz=(1+ε)/εE^{\varepsilon}_{a}T_{z}=(1+\varepsilon)/\varepsilon. Starting from zz, after one step the new walk has the stationary distribution π\pi on the original graph, and it follows easily that EzεTa=1+EπTa(1+O(ε))E^{\varepsilon}_{z}T_{a}=1+E_{\pi}T_{a}(1+O(\varepsilon)). We can regard the new walk up to time TzT_{z} as the old walk sent to zz at a random time UεU^{\varepsilon} with Geometric(ε/(1+ε)\varepsilon/(1+\varepsilon)) distribution, so for izi\neq z and jzj\neq z the flow fijεf^{\varepsilon}_{ij} is the expected net number of transitions iji\rightarrow j by the old walk up to time UεU^{\varepsilon}. From the spectral representation it follows easily that fijε=fij+O(ε)f^{\varepsilon}_{ij}=f_{ij}+O(\varepsilon). Similarly, for izi\neq z we have -fziε=fizε=Pa(X(Uε-1)=i)=πi+O(ε)-f^{\varepsilon}_{zi}=f^{\varepsilon}_{iz}=P_{a}(X(U^{\varepsilon}-1)=i)=\pi_{% i}+O(\varepsilon); noting that iIfizε=1\sum_{i\in I}f^{\varepsilon}_{iz}=1, the total contribution of such terms to the double sum in (3.105) is

2iI(πi+fizε-πi)2εwi\displaystyle 2\sum_{i\in I}\frac{(\pi_{i}+f^{\varepsilon}_{iz}-\pi_{i})^{2}}{% \varepsilon w_{i}} =\displaystyle= 2wεiI(πi+fizε-πi)2πi\displaystyle\frac{2}{w\varepsilon}\sum_{i\in I}\frac{(\pi_{i}+f^{\varepsilon}% _{iz}-\pi_{i})^{2}}{\pi_{i}}
=\displaystyle= 2wε(1+iI(fizε-πi)2πi)=2wε+O(ε).\displaystyle\frac{2}{w\varepsilon}\left(1+\sum_{i\in I}\frac{(f^{\varepsilon}% _{iz}-\pi_{i})^{2}}{\pi_{i}}\right)=\frac{2}{w\varepsilon}+O(\varepsilon).

So (3.105) becomes

1+εε+1+EπTa+O(ε)=w(1+2ε)(12iIjIfij2wij+1wε)+O(ε)\frac{1+\varepsilon}{\varepsilon}+1+E_{\pi}T_{a}+O(\varepsilon)=w(1+2% \varepsilon)\left(\frac{1}{2}\sum_{i\in I}\sum_{j\in I}\frac{f^{2}_{ij}}{w_{ij% }}\ +\frac{1}{w\varepsilon}\right)+O(\varepsilon)

Subtracting (1+2ε)/ε(1+2\varepsilon)/\varepsilon from both sides and letting ε0\varepsilon\rightarrow 0 gives the desired (3.104). This concludes the proof for the case A={a}A=\{a\}, once we use the mean hitting time formula to verify

g(i):ZiaZaa=1-EiTaEπTa.g(i):\equiv{\textstyle\frac{Z_{ia}}{Z_{aa}}}=1-{\textstyle\frac{E_{i}T_{a}}{E_% {\pi}T_{a}}}.

Finally, the extension to general AA is an exercise in use of the chain, say X*X^{*}, in which the set AA is collapsed to a single state aa. Recall Chapter 2 margin: 9/10/99 versionSection 7.3. In particular,

wij*=wij,  i,jAc; wia*=kAwik,  iAc; waa*=kAlAwkl; w*=w.w^{*}_{ij}=w_{ij},\ i,j\in A^{c};\quad w^{*}_{ia}=\sum_{k\in A}w_{ik},\ i\in A% ^{c};\quad w^{*}_{aa}=\sum_{k\in A}\sum_{l\in A}w_{kl};\quad w^{*}=w.

We now sketch the extension. First, EπTA=Eπ**TaE_{\pi}T_{A}=E^{*}_{\pi^{*}}T_{a}. Then the natural one-to-one correspondence between functions gg on II with g()=1g(\cdot)=1 on AA and iIπig(i)=0\sum_{i\in I}\pi_{i}g(i)=0 and functions g*g^{*} on I*I^{*} with g*(a)=0g^{*}(a)=0 and iI*πi*g*(i)=0\sum_{i\in I^{*}}\pi^{*}_{i}g^{*}(i)=0 gives a trivial proof of

EπTA=sup{1/(g,g):-<g<,  g()=1 on AA,  iπig(i)=0}.E_{\pi}T_{A}=\sup\left\{1/\mbox{${\cal E}$}(g,g):-\infty<g<\infty,\ g(\cdot)=1% \mbox{\rm\ on~{}$A$},\ \sum_{i}\pi_{i}g(i)=0\right\}.

It remains to show that

inf{Ψ(𝐟):𝐟 is a unit flow from AA to π\pi (for XX)}\displaystyle\inf\left\{\Psi({\bf f}):{\bf f}\mbox{\rm\ is a unit flow from~{}% $A$ to~{}$\pi$ (for~{}$X$)}\right\} (3.106)
=\displaystyle= inf{Ψ*(𝐟*):𝐟* is a unit flow from aa to π*\pi^{*} (for X*X^{*})}\displaystyle\inf\left\{\Psi^{*}({\bf f^{*}}):{\bf f^{*}}\mbox{\rm\ is a unit % flow from~{}$a$ to~{}$\pi^{*}$ (for~{}$X^{*}$)}\right\}

where

Ψ(𝐟):=12iIjI(fij2/wij), Ψ*(𝐟*):=12iI*jI*((fij*)2/wij*).\Psi({\bf f}):=\frac{1}{2}\sum_{i\in I}\sum_{j\in I}(f^{2}_{ij}/w_{ij}),\quad% \Psi^{*}({\bf f^{*}}):=\frac{1}{2}\sum_{i\in I^{*}}\sum_{j\in I^{*}}((f^{*}_{% ij})^{2}/w^{*}_{ij}).

Indeed, given a unit flow 𝐟{\bf f} from AA to π\pi, define fij*f^{*}_{ij} as fijf_{ij} if i,jAci,j\in A^{c} and as kAfik\sum_{k\in A}f_{ik} if iAci\in A^{c} and j=aj=a. One can check that 𝐟*{\bf f^{*}} is a unit flow from aa to π*\pi^{*} (the key observation being that iAjAfij=0\sum_{i\in A}\sum_{j\in A}f_{ij}=0) and, using the Cauchy–Schwarz inequality, that Ψ*(𝐟*)Ψ(𝐟)\Psi^{*}({\bf f^{*}})\leq\Psi({\bf f}). Conversely, given a unit flow 𝐟*{\bf f^{*}} from aa to π*\pi^{*}, define fijf_{ij} as fij*f^{*}_{ij} if i,jAci,j\in A^{c}, as fia*wij/kAwikf^{*}_{ia}w_{ij}/\sum_{k\in A}w_{ik} if iAci\in A^{c} and jAj\in A, and as 00 if i,jAi,j\in A. One can check that 𝐟{\bf f} is a unit flow from AA to π\pi and that Ψ(𝐟)=Ψ(𝐟*)\Psi({\bf f})=\Psi({\bf f^{*}}). We have thus established (3.106), completing the proof.    

Corollary 3.42

For chains with transition matrices 𝐏,𝐏~{\bf P},\tilde{{\bf P}} and the same stationary distribution π\pi,

minijpijp~ijEπ~TaEπTamaxijpijp~ij.\min_{i\neq j}\frac{p_{ij}}{\tilde{p}_{ij}}\leq\frac{\tilde{E_{\pi}}T_{a}}{E_{% \pi}T_{a}}\leq\max_{i\neq j}\frac{p_{ij}}{\tilde{p}_{ij}}.

Proof. Plug the minimizing flow 𝐟aπ{\bf f}^{a\rightarrow\pi} for the 𝐏{\bf P}-chain into Proposition 3.41 for the 𝐏~\tilde{{\bf P}}-chain to get the second inequality. The first follows by reversing the roles of 𝐏{\bf P} and 𝐏~\tilde{{\bf P}}.