5 Examples: Special Graphs and Trees (April 23 1996)

5.3 Trees

For random walk on a finite tree, we can develop explicit formulas for means and variances of first passage times, and for distributions of first hitting places. We shall only treat the unweighted case, but the formulas can be extended to the weighted case without difficulty.

xxx notation below —change ww to xx ? Used i,j,v,w,xi,j,v,w,x haphazardly for vertices.

In this section we’ll write rvr_{v} for the degree of a vertex vv, and d(v,x)d(v,x) for the distance between vv and xx. On a tree we may unambiguously write [v,x][v,x] for the path from vv to xx. Given vertices j,v1,v2,j,v_{1},v_{2},\ldots in a tree, the intersection of the paths [j,v1],  [j,v2],[j,v_{1}],\ [j,v_{2}],\ldots is a (maybe trivial) path; write d(j,v1v2)0d(j,v_{1}\wedge v_{2}\wedge\cdots)\geq 0 for the length of this intersection path.

On an nn-vertex tree, the random walk’s stationary distribution is


Recall from the beginning of this chapter that an edge (v,x)(v,x) of a graph is essential if its removal would disconnect the graph into two components A(v,x)A(v,x) and A(x,v)A(x,v), say, containing vv and xx respectively. Obviously, in a tree every edge is essential, so we get a lot of mileage out of the essential edge lemma (Lemma 5.1).

Theorem 5.20

Consider discrete-time random walk on an nn-vertex tree.

For each edge (i,j)(i,j),

EiTj=2|A(i,j)|-1E_{i}T_{j}=2|A(i,j)|-1 (5.81)
EiTj+EjTi=2(n-1).E_{i}T_{j}+E_{j}T_{i}=2(n-1). (5.82)

For arbitrary i,ji,j,

EiTj=-d(i,j)+2vd(j,iv)=vrvd(j,iv)E_{i}T_{j}=-d(i,j)+2\sum_{v}d(j,i\wedge v)=\sum_{v}r_{v}d(j,i\wedge v) (5.83)
EiTj+EjTi=2(n-1)d(i,j).E_{i}T_{j}+E_{j}T_{i}=2(n-1)d(i,j). (5.84)

For each edge (i,j)(i,j),

variTj=-EiTj+vA(i,j)wA(i,j)rvrw(2d(j,vw)-1).{\rm var}_{i}T_{j}=-E_{i}T_{j}+\sum_{v\in A(i,j)}\sum_{w\in A(i,j)}r_{v}r_{w}(% 2d(j,v\wedge w)-1). (5.85)

For arbitrary i,ji,j,

variTj=-EiTj+vwrvrwd(j,ivw)[2d(j,vw)-d(j,ivw)].{\rm var}_{i}T_{j}=-E_{i}T_{j}+\sum_{v}\sum_{w}r_{v}r_{w}d(j,i\wedge v\wedge w% )\left[2d(j,v\wedge w)-d(j,i\wedge v\wedge w)\right]. (5.86)

Remarks. 1. There are several equivalent expressions for the sums above: we chose the most symmetric-looking ones. We’ve written sums over vertices, but one could rephrase in terms of sums over edges.

2. In continuous time, the terms “-EiTj-E_{i}T_{j}” disappear from the variance formulas—see xxx.

Proof of Theorem 5.20. Equations (5.81) and (5.82) are rephrasings of (5.3) and (5.4) from the essential edge lemma. Equation (5.84) and the first equality in (5.83) follow from (5.82) and (5.81) by summing over the edges in the path [i,j][i,j]. Note alternatively that (5.84) can be regarded as a consequence of the commute interpretation of resistance, since the effective resistance between ii and jj is d(i,j)d(i,j). To get the second equality in (5.83),consider the following deterministic identity (whose proof is obvious), relating sums over vertices to sums over edges.

Lemma 5.21

Let ff be a function on the vertices of a tree, and let jj be a distinguished vertex. Then

vrvf(v)=vj(f(v)+f(v*))\sum_{v}r_{v}f(v)=\sum_{v\neq j}(f(v)+f(v^{*}))

where v*v^{*} is the first vertex (other than vv) in the path [v,j][v,j].

To apply to (5.83), note

d(j,iv*)\displaystyle d(j,i\wedge v^{*}) =\displaystyle= d(j,iv) if v[i,j]\displaystyle d(j,i\wedge v)\mbox{ if }v\not\in[i,j]
=\displaystyle= d(j,iv)-1 if v[i,j],  vj.\displaystyle d(j,i\wedge v)-1\mbox{ if }v\in[i,j],\ v\neq j.

The equality in Lemma 5.21 now becomes the equality in (5.83).

We prove (5.85) below. To derive (5.86) from it, sum over the edges in the path [i,j]=(i=i0,i1,,im=j)[i,j]=(i=i_{0},i_{1},\ldots,i_{m}=j) to obtain

variTj=-EiTj+vwl(2d(il+1,vw)-1){\rm var}_{i}T_{j}=-E_{i}T_{j}+\sum_{v}\sum_{w}\sum_{l}\left(2d(i_{l+1},v% \wedge w)-1\right) (5.87)

where l\sum_{l} denotes the sum over all 0lm-10\leq l\leq m-1 for which A(il,il+1)A(i_{l},i_{l+1}) contains both vv and ww. Given vertices vv and ww, there exist unique smallest values of pp and qq so that vA(ip,ip+1)v\in A(i_{p},i_{p+1}) and wA(iq,iq+1)w\in A(i_{q},i_{q+1}). If pqp\neq q, then the sum l\sum_{l} in (5.87) equals

l=pqm-1(2d(il+1,ipq)-1)\displaystyle\sum_{l=p\vee q}^{m-1}\left(2d(i_{l+1},i_{p\vee q})-1\right) =\displaystyle= l=pqm-1(2((l+1)-(pq))-1)\displaystyle\sum_{l=p\vee q}^{m-1}\left(2((l+1)-(p\vee q))-1\right)
=\displaystyle= (m-(pq))2=d2(j,vw)\displaystyle(m-(p\vee q))^{2}=d^{2}(j,v\wedge w)
=\displaystyle= d(j,ivw)[2d(j,vw)-d(j,ivw)],\displaystyle d(j,i\wedge v\wedge w)\left[2d(j,v\wedge w)-d(j,i\wedge v\wedge w% )\right],

as required by (5.86). If p=qp=q, then the sum l\sum_{l} in (5.87) equals

l=pm-1(2d(il+1,ip)+2d(ip,vw)-1)\sum_{l=p}^{m-1}\left(2d(i_{l+1},i_{p})+2d(i_{p},v\wedge w)-1\right)

which again equals d(j,ivw)[2d(j,vw)-d(j,ivw)]d(j,i\wedge v\wedge w)\left[2d(j,v\wedge w)-d(j,i\wedge v\wedge w)\right] by a similar calculation.

So it remains to prove (5.85), for which we may suppose, as in the proof of Lemma 5.1, that jj is a leaf. By considering the first step from jj to ii we have

varjTj+=variTj.{\rm var}_{j}T^{+}_{j}={\rm var}_{i}T_{j}.

Now yyy of Chapter 2 gives a general expression for varjTj+{\rm var}_{j}T^{+}_{j} in terms of EπTjE_{\pi}T_{j}, and in the present setting this becomes

varjTj+=2(n-1)-(2(n-1))2+v2rvEvTj.{\rm var}_{j}T^{+}_{j}=2(n-1)-(2(n-1))^{2}+\sum_{v}2r_{v}E_{v}T_{j}.

Using the second equality in (5.83), we may rewrite the sum as

vjwjrvrw2d(j,vw).\sum_{v\neq j}\sum_{w\neq j}r_{v}r_{w}2d(j,v\wedge w).


vjrv=2(n-1)-1.\sum_{v\neq j}r_{v}=2(n-1)-1.

Combining these expressions gives

variTj=-(2n-3)+vjwjrvrw(2d(j,vw)-1).{\rm var}_{i}T_{j}=-(2n-3)+\sum_{v\neq j}\sum_{w\neq j}r_{v}r_{w}(2d(j,v\wedge w% )-1).

But by (5.81), EiTj=2n-3E_{i}T_{j}=2n-3.    

5.3.1 Parameters for trees

Here we discuss the five parameters of Chapter 4. Obviously by (5.84)

τ*=2(n-1)Δ\tau^{*}=2(n-1)\Delta (5.88)

where Δ\Delta is the diameter of the tree. As for τc\tau_{c}, it is clear that the sup in its definition is attained by A(v,w)A(v,w) for some edge (v,w)(v,w). Note that

π(A(v,w))=2|A(v,w)|-12(n-1).\pi(A(v,w))=\frac{2|A(v,w)|-1}{2(n-1)}. (5.89)

This leads to

τc\displaystyle\tau_{c} =\displaystyle= max(v,w)2|A(v,w)|-12(n-1)2|A(w,v)|-12(n-1)12(n-1)\displaystyle\max_{(v,w)}\frac{\frac{2|A(v,w)|-1}{2(n-1)}\frac{2|A(w,v)|-1}{2(% n-1)}}{\frac{1}{2(n-1)}} (5.90)
=\displaystyle= max(v,w)4|A(v,w)||A(w,v)|-2n+12(n-1).\displaystyle\max_{(v,w)}\frac{4|A(v,w)||A(w,v)|-2n+1}{2(n-1)}.

Obviously the max is attained by an edge for which |A(v,w)||A(v,w)| is as close as possible to n/2n/2. This is one of several notions of “centrality” of vertices and edges which arise in our discussion—see Buckley and Harary [81] for a treatment of centrality in the general graph context, and for the standard graph-theoretic terminology.

Proposition 5.22

On an nn-vertex tree,

τ0=12+2n(v,w)[|A(v,w)||A(w,v)|-12(n-1)(|A(v,w)|2+|A(w,v)|2)]\tau_{0}=\frac{1}{2}+\frac{2}{n}\sum_{(v,w)}\left[|A(v,w)||A(w,v)|-\frac{1}{2(% n-1)}\left(|A(v,w)|^{2}+|A(w,v)|^{2}\right)\right]

where (v,w)\sum_{(v,w)} denotes the sum over all undirected edges (v,w)(v,w).

Proof. Using the formula for the stationary distribution, for each ii


Appealing to Lemma 5.21 (with ii as the distinguished vertex)


where a(i,i)=0a(i,i)=0 and a(i,j)=ExTja(i,j)=E_{x}T_{j}, where (j,x)(j,x) is the first edge of the path [j,i][j,i]. Taking the (unweighted) average over ii,


Each term EiTjE_{i}T_{j} is the sum of terms EvTwE_{v}T_{w} along the edges (v,w)(v,w) of the path [i,j][i,j]. Counting how many times a directed edge (v,w)(v,w) appears,


where we sum over directed edges (v,w)(v,w). Changing to a sum over undirected edges, using EvTw+EwTv=2(n-1)E_{v}T_{w}+E_{w}T_{v}=2(n-1) and EvTw=2|A(v,w)|-1E_{v}T_{w}=2|A(v,w)|-1, gives

2n(n-1)τ0\displaystyle 2n(n-1)\tau_{0} =\displaystyle= (v,w)[2|A(v,w)||A(w,v)|2(n-1)\displaystyle\sum_{(v,w)}\left[2|A(v,w)||A(w,v)|2(n-1)\right.
 -|A(v,w)|(2|A(v,w)|-1)\displaystyle\ \ -|A(v,w)|(2|A(v,w)|-1)
-|A(w,v)|(2|A(w,v)|-1)].\displaystyle\left.\ \ -|A(w,v)|(2|A(w,v)|-1)\right].

This simplifies to the assertion of the Proposition.    

For τ1\tau_{1} we content ourselves with a result “up to equivalence”.

Proposition 5.23

There exist constants K1,K2<K_{1},K_{2}<\infty such that

1K1minimaxjEjTiτ1K2minimaxjEjTi.\frac{1}{K_{1}}\min_{i}\max_{j}E_{j}T_{i}\leq\tau_{1}\leq K_{2}\min_{i}\max_{j% }E_{j}T_{i}.

Of course the expectations can be computed by (5.83).

Proof. We work with the parameter


which we know is equivalent to τ1\tau_{1}. Write


Fix an ii attaining the minimum. For arbitrary jj we have (the first equality uses the random target lemma, cf. the proof of Chapter 4 Lemma yyy)

kπk|EjTk-EiTk|\displaystyle\sum_{k}\pi_{k}|E_{j}T_{k}-E_{i}T_{k}| =\displaystyle= 2kπk(EjTk-EiTk)+\displaystyle 2\sum_{k}\pi_{k}(E_{j}T_{k}-E_{i}T_{k})^{+}
\displaystyle\leq 2kπkEjTi because EjTkEjTi+EiTk\displaystyle 2\sum_{k}\pi_{k}E_{j}T_{i}\mbox{ because }E_{j}T_{k}\leq E_{j}T_% {i}+E_{i}T_{k}
\displaystyle\leq 2σ\displaystyle 2\sigma

and so τ1(3)4σ\tau_{1}^{(3)}\leq 4\sigma.

For the converse, it is elementary that we can find a vertex ii such that the size (n*n^{*}, say) of the largest branch from ii satisfies n*n/2n^{*}\leq n/2. (This is another notion of “centrality”. To be precise, we are excluding ii itself from the branch.) Fix this ii, and consider the jj which maximizes EjTiE_{j}T_{i}, so that EjTiσE_{j}T_{i}\geq\sigma by definition. Let BB denote the set of vertices in the branch from ii which contains jj. Then

EjTk=EjTi+EiTk,  kBcE_{j}T_{k}=E_{j}T_{i}+E_{i}T_{k},\ k\in B^{c}

and so

τ1(3)kπk|EjTk-EiTk|π(Bc)EjTiπ(Bc)σ.\tau_{1}^{(3)}\geq\sum_{k}\pi_{k}|E_{j}T_{k}-E_{i}T_{k}|\geq\pi(B^{c})E_{j}T_{% i}\geq\pi(B^{c})\sigma.

But by (5.89) π(B)=2n*-12(n-1)12\pi(B)=\frac{2n^{*}-1}{2(n-1)}\leq\frac{1}{2}, so we have shown τ1(3)σ/2\tau_{1}^{(3)}\geq\sigma/2.    

We do not know whether τ2\tau_{2} has a simple expression “up to equivalence” analogous to Proposition 5.23. It is natural to apply the “distinguished paths” bound (Chapter 4 yyy). This gives the inequality

τ2\displaystyle\tau_{2} \displaystyle\leq 2(n-1)max(v,w)xA(v,w)yA(w,v)πxπyd(x,y)\displaystyle 2(n-1)\max_{(v,w)}\sum_{x\in A(v,w)}\sum_{y\in A(w,v)}\pi_{x}\pi% _{y}d(x,y)
=\displaystyle= 2(n-1)max(v,w)(π(A(v,w))E[d(v,V)1(VA(w,v))]\displaystyle 2(n-1)\max_{(v,w)}\left(\pi(A(v,w))E\left[d(v,V)1_{(V\in A(w,v))% }\right]\right.
+π(A(w,v))E[d(v,V)1(VA(v,w))])\displaystyle\left.\ \ \ \ \ +\pi(A(w,v))E\left[d(v,V)1_{(V\in A(v,w))}\right]\right)

where VV has the stationary distribution π\pi and where we got the equality by writing d(x,y)=d(v,y)+d(v,x)d(x,y)=d(v,y)+d(v,x). The edge attaining the max gives yet another notion of “centrality.”

xxx further remarks on τ2\tau_{2}.

5.3.2 Extremal trees

It is natural to think of the nn-path (Example 5.8) and the nn-star (Example 5.10) as being “extremal” amongst all nn-vertex trees. The proposition below confirms that the values of τ*\tau^{*}, maxi,jEiTj\max_{i,j}E_{i}T_{j}, τ0\tau_{0}, τ2\tau_{2}, and τc\tau_{c} in those examples are the exact extremal values (minimal for the star, maximal for the path).

Proposition 5.24

For any nn-vertex tree with n3n\geq 3,

(a) 4(n-1)τ*2(n-1)24(n-1)\leq\tau^{*}\leq 2(n-1)^{2}

(b) 2(n-1)maxi,jEiTj(n-1)22(n-1)\leq\max_{i,j}E_{i}T_{j}\leq(n-1)^{2}

(c) n-32τ0(2n2-4n+3)/6n-{\textstyle\frac{3}{2}}\leq\tau_{0}\leq(2n^{2}-4n+3)/6.

(d) 1τ2(1-cos(π/(n-1)))-11\leq\tau_{2}\leq(1-\cos(\pi/(n-1)))^{-1}.

(e) 1-12(n-1)τc4n2/4-2n+12(n-1)1-\frac{1}{2(n-1)}\leq\tau_{c}\leq\frac{4\lfloor n^{2}/4\rfloor-2n+1}{2(n-1)}.

Proof. (a) is obvious from (5.88), because Δ\Delta varies between 22 for the nn-star and (n-1)(n-1) for the nn-path. The lower bound in (b) follows from the lower bound in (a). For the upper bound in (b), consider some path i=v0,v1,,vd=ji=v_{0},v_{1},\ldots,v_{d}=j in the tree, where plainly d(n-1)d\leq(n-1). Now |A(vd-1,vd)|n-1|A(v_{d-1},v_{d})|\leq n-1 and so

|A(vd-i,vd-i+1)|n-i for all i|A(v_{d-i},v_{d-i+1})|\leq n-i\mbox{ for all }i

because the left side decreases by at least 11 as ii increases. So

EiTj\displaystyle E_{i}T_{j} =\displaystyle= m=0d-1EvmTvm+1\displaystyle\sum_{m=0}^{d-1}E_{v_{m}}T_{v_{m+1}}
=\displaystyle= m=0d-1(2|A(vm,vm+1)|-1) by (5.81)\displaystyle\sum_{m=0}^{d-1}(2|A(v_{m},v_{m+1})|-1)\mbox{ by (\ref{tt1})}
\displaystyle\leq m=0d-1(2(m+n-d)-1)\displaystyle\sum_{m=0}^{d-1}(2(m+n-d)-1)
\displaystyle\leq l=1n-1(2l-1)\displaystyle\sum_{l=1}^{n-1}(2l-1)
=\displaystyle= (n-1)2.\displaystyle(n-1)^{2}.

To prove (c), it is enough to show that the sum in Proposition 5.22 is minimized by the nn-star and maximized by the nn-path. For each undirected edge (v,w)(v,w), let

b(v,w)=min(|A(v,w)|,|A(w,v)|)n/2.b(v,w)=\min(|A(v,w)|,|A(w,v)|)\leq n/2.

Let 𝐛=(b1,b2,,bn-1){\bf b}=(b_{1},b_{2},\ldots,b_{n-1}) be the non-decreasing rearrangement of these values b(v,w)b(v,w). The summands in Proposition 5.22 are of the form


with aa ranging over the bib_{i}.

One can check that this quantity is an increasing function of an/2a\leq n/2. Thus it is enough to show that the vector b on an arbitrary nn-tree dominates coordinatewise the vector b for the nn-star and is dominated by the vector b for the nn-path. The former is obvious, since on the nn-star 𝐛=(1,1,,1){\bf b}=(1,1,\ldots,1). The latter needs a little work. On the nn-path 𝐛=(1,1,2,2,3,3,){\bf b}=(1,1,2,2,3,3,\ldots). So we must prove that in any nn-tree

bii+12 for all i.b_{i}\leq\left\lfloor\frac{i+1}{2}\right\rfloor\mbox{ for all }i. (5.91)

Consider a rooted tree on mm vertices. Breaking an edge ee gives two components; let a(e)a(e) be the size of the component not containing the root. Let (a1,a2,)(a_{1},a_{2},\ldots) be the non-decreasing rearrangement of (a(e))(a(e)). For an mm-path rooted at one leaf, (a1,a2,)=(1,2,3,)(a_{1},a_{2},\ldots)=(1,2,3,\ldots). We assert this is extremal, in that for any rooted tree

aii for all i.a_{i}\leq i\mbox{ for all }i. (5.92)

This fact can be proved by an obvious induction on mm, growing trees by adding leaves.

Now consider an unrooted tree, and let b be as above. There exists some vertex vv, of degree r2r\geq 2, such that each of the rr branches from vv has size (excluding vv) at most n/2n/2. Consider these branches as trees rooted at vv, apply (5.92), and it is easy to deduce (5.91).

For (d), the lower bound is easy. Fix a leaf vv and let ww be its neighbor. We want to apply the extremal characterization (Chapter 3 yyy) of τ2\tau_{2} to the function

g(v)=1-πv-πw,g(w)=0,g()=-πv elsewhere.g(v)=1-\pi_{v}-\pi_{w},g(w)=0,g(\cdot)=-\pi_{v}\mbox{ elsewhere}.

For this function, πxg(x)=0\sum\pi_{x}g(x)=0,


and by considering transitions out of ww

(g,g)=πv(1-πv-πw)2+(πw-πv)πv2.\mbox{${\cal E}$}(g,g)=\pi_{v}(1-\pi_{v}-\pi_{w})^{2}+(\pi_{w}-\pi_{v})\pi^{2}% _{v}.

Since πw1/2\pi_{w}\leq 1/2 we have [g,g](g,g)[g,g]\geq\mbox{${\cal E}$}(g,g) and hence τ2[g,g]/(g,g)1\tau_{2}\geq[g,g]/\mbox{${\cal E}$}(g,g)\geq 1.

qqq Anyone have a short proof of upper bound in (d)?

Finally, (e) is clear from (5.90).    

Other extremal questions. Several other extremal questions have been studied. Results on cover time are given in Chapter 6. Yaron [340] shows that for leaves ll the mean hitting time EπTlE_{\pi}T_{l} is maximal on the nn-path and minimal on the nn-star. (He actually studies the variance of return times, but Chapter 2 yyy permits the rephrasing.) Finally, if we are interested in the mean hitting time ExTAE_{x}T_{A} or the hitting place distribution, we can reduce to the case where AA is the set LL of leaves, and then set up recursively-solvable equations for h(i)EiTLh(i)\equiv E_{i}T_{L} or for f(i)=Pi(TA=Tl)f(i)=P_{i}(T_{A}=T_{l}) for fixed lLl\in L. An elementary treatment of such ideas is in Pearce [277], who essentially proved that (on nn-vertex trees) maxxExTL\max_{x}E_{x}T_{L} is minimized by the nn-star and maximized by the nn-path.