3 Reversible Markov Chains (September 10, 2002)3.3 Electrical networks 3.5 Complete monotonicity

3.4 The spectral representation

Use the transition matrix ${\bf P}$ to define

s_{ij}=\pi^{1/2}_{i}p_{ij}\pi^{-1/2}_{j}.

From definition (3.1), ${\bf S}$ is a symmetric matrix. So we can apply the elementary diagonalization theorem. The authors find it helpful to distinguish between the state space $I=\{i,j,\ldots\}$ , of size $n$ say, and the index set of integers $[n]=\{1,2,\ldots,n\}$ , the point being that the state space may have a lot of extra structure, whereas the index set has no obvious structure. The spectral theorem ([183] Theorem 4.1.5) gives a representation

{\bf S}={\bf U}\Lambda{\bf U}^{T}

where ${\bf U}=(u_{im})_{i\in I,m\in[n]}$ is an orthonormal matrix, and $\Lambda=(\lambda_{m,m^{\prime}})_{m,m^{\prime}\in[n]}$ is a diagonal real matrix. We can write the diagonal entries of $\Lambda$ as $(\lambda_{m})$ , and arrange them in decreasing order. Then

1=\lambda_{1}>\lambda_{2}\geq\cdots\geq\lambda_{n}\geq-1.

(3.30)

The classical fact that $|\lambda_{i}|\leq 1$ follows easily from the fact that the entries of ${\bf S}^{(t)}$ are bounded as $t\rightarrow\infty$ by (3.31) below. These $\lambda$ ’s are the eigenvalues of ${\bf P}$ , as well as of ${\bf S}$ . That is, the solutions $(\lambda;x)$ with $x_{i}\not\equiv 0$ of

\sum_{i}x_{i}p_{ij}=\lambda x_{j}\mbox{\ \ for all\ }j

are exactly the pairs

(\lambda=\lambda_{m};x_{i}=c_{m}\pi^{1/2}_{i}u_{im},i=1,\ldots,n)

for $m=1,\ldots,n$ , where $c_{m}\neq 0$ is arbitrary. And the solutions of

\sum_{j}p_{ij}y_{j}=\lambda y_{i}\mbox{\ \ for all\ }i

are exactly the pairs

(\lambda=\lambda_{m};\ \,y_{i}=c_{m}\pi^{-1/2}_{i}u_{im},\ i=1,\ldots,n).

Note that an eigenvector $(u_{i1})$ of ${\bf S}$ corresponding to the eigenvalue $\lambda_{1}=1$ is

u_{i1}=\pi^{1/2}_{i}.

Uniqueness of the stationary distribution now implies $\lambda_{2}<1$ .

Now consider matrix powers. We have

{\bf S}^{(t)}={\bf U}\Lambda^{(t)}{\bf U}^{T}

and

p^{(t)}_{ij}=\pi^{-1/2}_{i}s^{(t)}_{ij}\pi^{1/2}_{j},

(3.31)

P_{i}(X_{t}=j)=\pi^{-1/2}_{i}\pi^{1/2}_{j}\sum_{m=1}^{n}\lambda^{t}_{m}u_{im}u% _{jm}.

(3.32)

This is the spectral representation formula. In continuous time, the analogous formula is

P_{i}(X_{t}=j)=\pi^{-1/2}_{i}\pi^{1/2}_{j}\sum_{m=1}^{n}\exp(-\lambda_{m}t)u_{% im}u_{jm}.

(3.33)

As before, ${\bf U}$ is an orthonormal matrix and $u_{i1}=\pi^{1/2}_{i}$ , and now the $\lambda$ ’s are the eigenvalues of $-{\bf Q}$ . In the continuous-time setting, the eigenvalues satisfy

0=\lambda_{1}<\lambda_{2}\leq\cdots\leq\lambda_{n}.

(3.34)

Rather than give the general proof, let us consider the effect of continuizing the discrete-time chain (3.32). The continuized chain $(Y_{t})$ can be represented as $Y_{t}=X_{N(t)}$ where $N(t)$ has $\mbox{Poisson}(t)$ distribution, so by conditioning on $N(t)=\nu$ ,

	$\displaystyle P_{i}(Y_{t}=j)$	$\displaystyle=$	$\displaystyle\pi^{-1/2}_{i}\pi^{1/2}_{j}\sum_{m=1}^{n}u_{im}u_{jm}\sum_{\nu=0}% ^{\infty}\lambda_{m}^{\nu}\frac{e^{-t}t^{\nu}}{\nu!}$
		$\displaystyle=$	$\displaystyle\pi^{-1/2}_{i}\pi^{1/2}_{j}\sum_{m=1}^{n}u_{im}u_{jm}\exp(-(1-% \lambda_{m})t).$

So when we compare the spectral representations (3.32),(3.33) for a discrete-time chain and its continuization, the orthonormal matrices are identical, and the eigenvalues are related by

\lambda^{\mbox{\rm\scriptsize(c)}}_{m}=1-\lambda^{\mbox{\rm\scriptsize(d)}}_{m}

(3.35)

superscripts (c) and (d) indicating continuous or discrete time. In particular, this relation holds for the basic discrete and continuous time random walks on a graph.

Let us point out some interesting simple consequences of the spectral representation. For these purposes continuous time is simpler. First,

P_{i}(X_{t}=j)-\pi_{j}=c_{ij}e^{-\lambda_{2}t}+o(e^{-\lambda_{2}t})\mbox{\ \ % as\ }t\rightarrow\infty

(3.36)

where $c_{ij}=\pi^{-1/2}_{i}\pi^{1/2}_{j}\sum_{m:\lambda_{m}=\lambda_{2}}u_{im}u_{jm}$ and where “typically” $c_{ij}\neq 0$ . (A precise statement is this: there exists $i$ such that

P_{i}(X_{t}=i)-\pi_{i}\sim c_{ii}e^{-\lambda_{2}t},\ \ c_{ii}>0,

(3.37)

by considering $i$ such that $u_{i2}\neq 0$ .) Thus $\lambda_{2}$ has the interpretation of “asymptotic rate of convergence to the stationary distribution”. The authors find it simpler to interpret parameters measuring “time” rather than “ $1/\mbox{time}$ ”, and so prefer to work with the relaxation time $\tau_{2}$ defined by

	$\displaystyle\tau_{2}$	$\displaystyle:=$	$\displaystyle 1/\lambda_{2}\mbox{\ \ for a continuous-time chain}$		(3.38)
	$\displaystyle\tau_{2}$	$\displaystyle:=$	$\displaystyle 1/(1-\lambda_{2})\mbox{\ \ for a discrete-time chain.}$		(3.39)

Note that by (3.35) the value of $\tau_{2}$ is unchanged by continuizing a discrete-time chain.

Still in continuous time, the spectral representation gives

P_{i}(X_{t}=i)=\pi_{i}+\sum_{m\geq 2}u^{2}_{im}\exp(-\lambda_{m}t)

(3.40)

so the right side is decreasing with $t$ , and in fact is completely monotone, a subject pursued in Section 3.5. Thus $Z_{ii}$ defined in ^†^†margin: 9/10/99 versionChapter 2 Section 2.3 satisfies

	$\displaystyle Z_{ii}$	$\displaystyle=$	$\displaystyle\int_{0}^{\infty}\!(P_{i}(X_{t}=i)-\pi_{i})\,dt$		(3.41)
		$\displaystyle=$	$\displaystyle\sum_{m\geq 2}u^{2}_{im}\lambda^{-1}_{m}\mbox{\ \ by~{}(\ref{pii}% )}.$		(3.41)

Using the orthonormal property of ${\bf U}$ ,

\sum_{i}Z_{ii}=\sum_{m\geq 2}\lambda^{-1}_{m}.

Applying Corollary 13 of ^†^†margin: 9/10/99 versionChapter 2, we obtain a fundamental result relating average hitting times to eigenvalues.

Proposition 3.13 (The eigentime identity)

For each $i$ ,

	$\displaystyle\sum_{j}\pi_{j}E_{i}T_{j}$	$\displaystyle=$	$\displaystyle\sum_{m\geq 2}\lambda^{-1}_{m}\mbox{\ \ {\ \ \ \ \ \ \ \ \rm(}% continuous time\/{\rm)}}$
	$\displaystyle\sum_{j}\pi_{j}E_{i}T_{j}$	$\displaystyle=$	$\displaystyle\sum_{m\geq 2}(1-\lambda_{m})^{-1}\mbox{\ \ {\rm(}discrete time\/% {\rm)}}.$

[The discrete-time version follows from (3.35).] Proposition 3.13 expands upon the random target lemma, which said that (even for non-reversible chains) $\sum_{j}\pi_{j}E_{i}T_{j}$ does not depend on $i$ .

3.4.1 Mean hitting times and reversible chains

In ^†^†margin: 9/10/99 versionChapter 2 Section 2.2 we listed identities for general chains such as the mean hitting time formulas

E_{i}T_{j}=(Z_{jj}-Z_{ij})/\pi_{j};\ \ \ E_{\pi}T_{j}=Z_{jj}/\pi_{j}.

There are a number of more complicated identities for general chains in which one side becomes zero for any reversible chain (by the symmetry property $\pi_{i}Z_{ij}=\pi_{j}Z_{ji}$ ) and which therefore simplify to give identities for reversible chains. We have already seen one example, the cyclic tour lemma, and the following result may be considered an extension of that lemma. [Indeed, sum the following equation over successive pairs $(i,j)$ along a cycle to recapture the cyclic tour lemma.]

Corollary 3.14

$E_{\pi}T_{j}-E_{\pi}T_{i}=E_{i}T_{j}-E_{j}T_{i}$ .

This identity follows immediately from the mean hitting time formulas and the symmetry property. Note the following interpretation of the corollary. Define an ordering $i\preceq j$ on the states by

i\preceq j\mbox{\ \ iff\ \ }E_{\pi}T_{i}\leq E_{\pi}T_{j}.

Then Corollary 3.14 implies

E_{i}T_{j}\geq E_{j}T_{i}\mbox{\ \ iff\ \ }i\preceq j.

Warning. Corollary 3.14 does not imply

	$\displaystyle\hskip{-119.2455pt}\max_{i,j}E_{i}T_{j}\mbox{\ is attained by some pair\ }(i_{},j_{})\mbox{\ such that}$
	$\displaystyle\hskip{-119.2455pt}\quad i_{}\mbox{\ attains\ }\min_{i}E_{\pi}T_% {i}\mbox{\ and\ }j_{}\mbox{\ attains\ }\max_{j}E_{\pi}T_{j}.$

Here ^†^†margin: I haven’t tried to find counterexamples with more than three states. is a counterexample. Choose $0<\varepsilon<1/2$ arbitrarily and let

{\bf P}:=\left[\begin{array}[]{ccc}2\varepsilon&1-2\varepsilon&0\\ \varepsilon&1-2\varepsilon&\varepsilon\\ 0&1-2\varepsilon&2\varepsilon\end{array}\right].

We invite the reader to perform the computations necessary to verify that ${\bf P}$ is reversible with $\pi=[\varepsilon,1-2\varepsilon,\varepsilon]$ and

(E_{i}T_{j})=\varepsilon^{-1}(1-2\varepsilon)^{-1}\left[\begin{array}[]{ccc}0&% \varepsilon&1\\ 1-\varepsilon&0&1-\varepsilon\\ 1&\varepsilon&0\end{array}\right],

so that $(E_{\pi}T_{i})=\varepsilon^{-1}(1-2\varepsilon)^{-1}[1-2\varepsilon+2% \varepsilon^{2},2\varepsilon^{2},1-2\varepsilon+2\varepsilon^{2}]$ . Thus $E_{\pi}T_{i}$ is minimized uniquely by $i^{*}=2$ , while $\max_{i,j}E_{i}T_{j}$ is attained only by the pairs $(1,3)$ and $(3,1)$ .

As a second instance of what reversibility implies, note, from (3.33) and the definition of $Z_{ij}$ , that

Z_{ij}=\pi^{-1/2}_{i}\pi^{1/2}_{j}\sum_{m\geq 2}\lambda^{-1}_{m}u_{im}u_{jm}.

This implies

\mbox{the symmetrized matrix\ }\pi^{1/2}_{i}Z_{ij}\pi^{-1/2}_{j}\mbox{\ is % positive semidefinite.}

(3.42)

Note that a symmetric positive semidefinite matrix $(M_{ij})$ has the property $M^{2}_{ij}\leq M_{ii}M_{jj}$ . This gives

Z^{2}_{ij}\leq Z_{ii}Z_{jj}\pi_{j}/\pi_{i},

(3.43)

which enables us to upper-bound mean hitting times from arbitrary starts in terms of mean hitting times from stationary starts.

Lemma 3.15

$\max_{ij}E_{i}T_{j}\leq 2\max_{k}E_{\pi}T_{k}$ .

Proof. Using (3.43),

(Z_{ij}/\pi_{j})^{2}\leq(Z_{ii}/\pi_{i})\ (Z_{jj}/\pi_{j})

and so

-Z_{ij}/\pi_{j}\leq\max_{k}Z_{kk}/\pi_{k}.

So the mean hitting time formula gives the two equalities in

E_{i}T_{j}=\frac{Z_{jj}}{\pi_{j}}-\frac{Z_{ij}}{\pi_{j}}\leq 2\max_{k}\frac{Z_% {kk}}{\pi_{k}}=2\max_{k}E_{\pi}T_{k}.\qquad\qquad~{}\ \ \rule{4.3pt}{4.3pt}

3.3 Electrical networks Bibliography 3.5 Complete monotonicity

Generated on Mon Jun 2 14:23:48 2014 by LaTeXML [LOGO]