2 General Markov Chains (September 10, 1999)

2.5 Distributional identities

It is much harder to get useful information about distributions (rather than mere expectations). Here are a few general results.

2.5.1 Stationarity consequences

A few useful facts about stationary Markov chains are, to experts, just specializations of facts about arbitrary (i.e. not-necessarily-Markov) stationary processes. Here we give a bare-hands proof of one such fact, the relation between the distribution of the return time to a subset $A$ and the distribution of the first hitting time on $A$ from a stationary start. We start in discrete time.

Lemma 2.23

For $t=1,2,\ldots$,

P_{\pi}(T_{A}=t-1)=P_{\pi}(T^{+}_{A}=t)=\pi(A)P_{\pi_{A}}(T^{+}_{A}\geq t)

where $\pi_{A}(i):=\pi_{i}/\pi(A),\ i\in A$.

Proof. The first equality is obvious. Now let $(X_{t})$ be the chain started with its stationary distribution $\pi$. Then

\begin{eqnarray*}
P_{\pi}(T^{+}_{A}=t) &=& P(X_{1}\not\in A,\ldots,X_{t-1}\not\in A,X_{t}\in A)\\
&=& P(X_{1}\not\in A,\ldots,X_{t-1}\not\in A)-P(X_{1}\not\in A,\ldots,X_{t}\not\in A)\\
&=& P(X_{1}\not\in A,\ldots,X_{t-1}\not\in A)-P(X_{0}\not\in A,\ldots,X_{t-1}\not\in A)\\
&=& P(X_{0}\in A,X_{1}\not\in A,\ldots,X_{t-1}\not\in A)\\
&=& \pi(A)P_{\pi_{A}}(T^{+}_{A}\geq t),
\end{eqnarray*}

establishing the Lemma.
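For readers who want to see the identity concretely, here is a minimal NumPy sketch (not from the text) that evaluates both sides of Lemma 2.23 exactly for a small chain; the 4-state matrix P, the subset A = {0,1}, and the helper-function names are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative check of Lemma 2.23 (not from the text): the 4-state matrix P
# and the subset A = {0, 1} are arbitrary choices made up for this sketch.
P = np.array([[0.10, 0.40, 0.30, 0.20],
              [0.30, 0.20, 0.10, 0.40],
              [0.25, 0.25, 0.25, 0.25],
              [0.40, 0.10, 0.20, 0.30]])

# Stationary distribution: left eigenvector of P with eigenvalue 1, normalized.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

A, C = [0, 1], [2, 3]            # the subset A and its complement A^c
piA = pi[A] / pi[A].sum()        # pi conditioned on A

def P_pi_Tplus_eq(t):
    """P_pi(T_A^+ = t): avoid A at steps 1,...,t-1, enter A at step t."""
    if t == 1:
        return pi @ P[:, A] @ np.ones(len(A))
    vec = pi @ P[:, C]                                           # step 1, stay in A^c
    vec = vec @ np.linalg.matrix_power(P[np.ix_(C, C)], t - 2)   # steps 2..t-1 in A^c
    return vec @ P[np.ix_(C, A)] @ np.ones(len(A))               # step t into A

def P_piA_Tplus_ge(t):
    """P_{pi_A}(T_A^+ >= t): start on A with law pi_A, avoid A at steps 1,...,t-1."""
    if t == 1:
        return 1.0
    vec = piA @ P[np.ix_(A, C)]
    vec = vec @ np.linalg.matrix_power(P[np.ix_(C, C)], t - 2)
    return vec.sum()

for t in range(1, 8):
    print(t, P_pi_Tplus_eq(t), pi[A].sum() * P_piA_Tplus_ge(t))  # columns should agree
```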

We’ll give two consequences of Lemma 2.23. Summing over $t$ gives

Corollary 2.24 (Kac’s formula)

\pi(A)E_{\pi_{A}}T^{+}_{A}=1

which extends the familiar fact $E_{i}T^{+}_{i}=1/\pi_{i}$. Multiplying the identity of Lemma 2.23 by $t$ and summing gives

\begin{eqnarray*}
E_{\pi}T_{A}+1 &=& \sum_{t\geq 1}tP_{\pi}(T_{A}=t-1)\\
&=& \pi(A)\sum_{t\geq 1}tP_{\pi_{A}}(T^{+}_{A}\geq t)\\
&=& \pi(A)\sum_{m\geq 1}{\textstyle\frac{1}{2}}m(m+1)P_{\pi_{A}}(T^{+}_{A}=m)\\
&=& \frac{\pi(A)}{2}\left(E_{\pi_{A}}T^{+}_{A}+E_{\pi_{A}}(T^{+}_{A})^{2}\right).
\end{eqnarray*}

Appealing to Kac’s formula and rearranging,

E_{\pi_{A}}(T^{+}_{A})^{2} = \frac{2E_{\pi}T_{A}+1}{\pi(A)}, \qquad (2.21)

\mbox{var}_{\pi_{A}}(T^{+}_{A}) = \frac{2E_{\pi}T_{A}+1}{\pi(A)}-\frac{1}{\pi^{2}(A)}. \qquad (2.22)
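Continuing with the same arbitrary illustrative chain, the following sketch (again not from the text) checks Kac's formula and (2.21)–(2.22) by computing the first two moments of the relevant hitting times from standard first-step-analysis linear systems.

```python
import numpy as np

# Illustrative check of Kac's formula and (2.21)-(2.22); same arbitrary chain as above.
P = np.array([[0.10, 0.40, 0.30, 0.20],
              [0.30, 0.20, 0.10, 0.40],
              [0.25, 0.25, 0.25, 0.25],
              [0.40, 0.10, 0.20, 0.30]])
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))]); pi = pi / pi.sum()
A, C = [0, 1], [2, 3]
piA = pi[A] / pi[A].sum()

# First-step analysis: first and second moments of T_A from states in A^c.
PCC = P[np.ix_(C, C)]
I = np.eye(len(C))
mC = np.linalg.solve(I - PCC, np.ones(len(C)))        # E_i T_A,   i in A^c
sC = np.linalg.solve(I - PCC, 1 + 2 * PCC @ mC)       # E_i T_A^2, i in A^c
m = np.zeros(len(P)); m[C] = mC                       # E_i T_A = 0 on A
s = np.zeros(len(P)); s[C] = sC

E_pi_TA = pi @ m                                      # E_pi T_A
# Moments of T_A^+ from a pi_A start, by conditioning on the first step.
E1 = 1 + piA @ P[A, :] @ m                            # E_{pi_A} T_A^+
E2 = 1 + 2 * (piA @ P[A, :] @ m) + piA @ P[A, :] @ s  # E_{pi_A} (T_A^+)^2

a = pi[A].sum()
print("Kac:   ", a * E1, "  (should be 1)")
print("(2.21):", E2, " vs ", (2 * E_pi_TA + 1) / a)
print("(2.22):", E2 - E1 ** 2, " vs ", (2 * E_pi_TA + 1) / a - 1 / a ** 2)
```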

More generally, there is a relation between $E_{\pi_{A}}(T^{+}_{A})^{p}$ and $E_{\pi}(T^{+}_{A})^{p-1}$.

In continuous time, the analog of Lemma 2.23 is

P_{\pi}(T_{A}\in(t,t+dt))=Q(A,A^{c})P_{\rho_{A}}(T_{A}>t)\,dt,\ t>0 \qquad (2.23)

where

Q(A,A^{c}):=\sum_{i\in A}\sum_{j\in A^{c}}q_{ij},\ \ \ \rho_{A}(j):=\sum_{i\in A}q_{ij}/Q(A,A^{c}),\ j\in A^{c}.

Integrating over $t>0$ gives the analog of Kac’s formula

Q(A,A^{c})E_{\rho_{A}}T_{A}=\pi(A^{c}). \qquad (2.24)
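Here is a corresponding check of (2.24), under an arbitrary made-up generator Q (not an example from the text); the mean hitting times are computed from the continuous-time first-step-analysis system.

```python
import numpy as np

# Illustrative check of (2.24); the generator Q (rows sum to 0) is an arbitrary
# made-up example, not one from the text.
Q = np.array([[-1.0,  0.4,  0.3,  0.3],
              [ 0.5, -1.2,  0.3,  0.4],
              [ 0.2,  0.3, -0.7,  0.2],
              [ 0.6,  0.2,  0.4, -1.2]])

# Stationary distribution: solve pi Q = 0 together with sum(pi) = 1.
M = np.vstack([Q.T, np.ones(len(Q))])
b = np.zeros(len(Q) + 1); b[-1] = 1.0
pi = np.linalg.lstsq(M, b, rcond=None)[0]

A, C = [0, 1], [2, 3]
QAAc = Q[np.ix_(A, C)].sum()                     # Q(A, A^c)
rhoA = Q[np.ix_(A, C)].sum(axis=0) / QAAc        # rho_A(j), j in A^c

# E_j T_A for j in A^c: solve Q_{CC} h = -1 (continuous-time first-step analysis).
h = np.linalg.solve(Q[np.ix_(C, C)], -np.ones(len(C)))

print("Q(A,A^c) E_{rho_A} T_A =", QAAc * (rhoA @ h))
print("pi(A^c)                =", pi[C].sum())   # the two should agree, as in (2.24)
```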

2.5.2 A generating function identity

Transform methods are useful in analyzing special examples, though that is not the main focus of this book. We record below just the simplest “transform fact”. We work in discrete time and use generating functions – the corresponding result in continuous time can be stated using Laplace transforms.

Lemma 2.25

Define

G_{ij}(z)=\sum_{t\geq 0}P_{i}(X_{t}=j)z^{t},\qquad F_{ij}(z)=\sum_{t\geq 0}P_{i}(T_{j}=t)z^{t}.

Then $F_{ij}=G_{ij}/G_{jj}$.

Analysis proof. Conditioning on $T_{j}$ gives

p^{(t)}_{ij}=\sum_{l=0}^{t}P_{i}(T_{j}=l)\,p^{(t-l)}_{jj}

and so

\sum_{t\geq 0}p^{(t)}_{ij}z^{t}=\sum_{l\geq 0}\sum_{t-l\geq 0}P_{i}(T_{j}=l)z^{l}\,p^{(t-l)}_{jj}z^{t-l}.

Thus $G_{ij}(z)=F_{ij}(z)G_{jj}(z)$, and the lemma follows. \Box

Probability proof. Let $\zeta$ have geometric($z$) law $P(\zeta>t)=z^{t}$, independent of the chain. Then

\begin{eqnarray*}
G_{ij}(z) &=& E_{i}(\mbox{number of visits to $j$ before time $\zeta$})\\
&=& P_{i}(T_{j}<\zeta)\ E_{j}(\mbox{number of visits to $j$ before time $\zeta$})\\
&=& F_{ij}(z)G_{jj}(z).
\end{eqnarray*}

\Box
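As a sanity check of Lemma 2.25 (not part of the text), the sketch below evaluates $G_{ij}(z)$ via $(I-zP)^{-1}$ and $F_{ij}(z)$ by propagating the law of the chain killed at $j$, for the same arbitrary illustrative chain and a fixed value of $z$; the indices i, j and the value z = 0.7 are arbitrary.

```python
import numpy as np

# Illustrative check of Lemma 2.25; same arbitrary 4-state chain, fixed i, j, z.
P = np.array([[0.10, 0.40, 0.30, 0.20],
              [0.30, 0.20, 0.10, 0.40],
              [0.25, 0.25, 0.25, 0.25],
              [0.40, 0.10, 0.20, 0.30]])
n = len(P)
i, j, z = 0, 2, 0.7

# G_{ij}(z) = sum_t z^t (P^t)_{ij} = [(I - zP)^{-1}]_{ij} for |z| < 1.
G = np.linalg.inv(np.eye(n) - z * P)

# F_{ij}(z) = sum_t z^t P_i(T_j = t): propagate the law of the chain killed at j.
F, alive, t = 0.0, np.zeros(n), 0
alive[i] = 1.0                          # mass that has not yet visited j
while alive.sum() > 1e-15 and t < 10_000:
    F += (z ** t) * alive[j]            # mass hitting j for the first time at t
    alive[j] = 0.0                      # remove it
    alive = alive @ P                   # one more step for the survivors
    t += 1

print("F_ij(z)         =", F)
print("G_ij(z)/G_jj(z) =", G[i, j] / G[j, j])   # should agree
```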

Note that, differentiating term by term,

E_{i}T_{j}=\left.{\textstyle\frac{d}{dz}}F_{ij}(z)\right|_{z=1}.

This and Lemma 2.25 can be used to give an alternative derivation of the mean hitting time formula, Lemma 2.12.
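The following sketch illustrates that remark in a purely numerical way (it is not Lemma 2.12 itself): it recovers $E_{i}T_{j}$ by numerically differentiating $F_{ij}(z)=G_{ij}(z)/G_{jj}(z)$ at $z=1$ and compares the result with first-step analysis, for the same arbitrary illustrative chain.

```python
import numpy as np

# Numerically recover E_i T_j from F_ij(z) = G_ij(z)/G_jj(z) (Lemma 2.25) by a
# one-sided difference at z = 1, and compare with first-step analysis.
# Same arbitrary illustrative chain; this is not Lemma 2.12 itself.
P = np.array([[0.10, 0.40, 0.30, 0.20],
              [0.30, 0.20, 0.10, 0.40],
              [0.25, 0.25, 0.25, 0.25],
              [0.40, 0.10, 0.20, 0.30]])
n, i, j = len(P), 0, 2

def F(z):
    G = np.linalg.inv(np.eye(n) - z * P)
    return G[i, j] / G[j, j]

h = 1e-4
# F_ij(1) = P_i(T_j < infinity) = 1 for an irreducible finite chain.
print("dF/dz at z=1 (numerical):", (1.0 - F(1.0 - h)) / h)

# E_i T_j by solving (I - P restricted off j) m = 1.
others = [k for k in range(n) if k != j]
m = np.linalg.solve(np.eye(n - 1) - P[np.ix_(others, others)], np.ones(n - 1))
print("E_i T_j (linear system):  ", m[others.index(i)])
```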

2.5.3 Distributions and continuization

The distribution at time $t$ of the continuization $\hat{X}$ of a discrete-time chain $X$ is most simply viewed as a Poisson mixture of the distributions $(X_{s})$. That is, $\hat{X}_{t}\ \stackrel{d}{=}\ X_{N_{t}}$ where $N_{t}$ has Poisson($t$) distribution independent of $X$. At greater length,

P_{i}(\hat{X}_{t}=j)=\sum_{s=0}^{\infty}\frac{e^{-t}t^{s}}{s!}P_{i}(X_{s}=j).

This holds because we can construct $\hat{X}$ from $X$ by replacing the deterministic "time $1$" holds by random, exponential($1$), holds $(\xi_{j})$ between jumps, and then the number $N_{t}$ of jumps before time $t$ has Poisson($t$) distribution. Now write $S_{n}=\sum_{j=1}^{n}\xi_{j}$ for the time of the $n$'th jump. Then the hitting time $\hat{T}_{A}$ for the continuized chain is related to the hitting time $T_{A}$ of the discrete-time chain by $\hat{T}_{A}=S_{T_{A}}$. Though these two hitting time distributions are different, their expectations are the same, and their variances are related in a simple way. To see this, note that the conditional distribution of $\hat{T}_{A}$ given $T_{A}$ is the distribution of the sum of $T_{A}$ independent $\xi$'s, so (using the notion of conditional expectation given a random variable)

E(\hat{T}_{A}|T_{A})=T_{A},\qquad {\rm var}\ (\hat{T}_{A}|T_{A})=T_{A}.

Thus (for any initial distribution)

E\hat{T}_{A}=E\,E(\hat{T}_{A}|T_{A})=ET_{A}.

And the conditional variance formula ([133] p. 198)

{\rm var}\ Z=E\,{\rm var}\ (Z|Y)+{\rm var}\ E(Z|Y)

tells us that

\begin{eqnarray*}
{\rm var}\ \hat{T}_{A} &=& E\,{\rm var}\ (\hat{T}_{A}|T_{A})+{\rm var}\ E(\hat{T}_{A}|T_{A}) \qquad (2.25)\\
&=& ET_{A}+{\rm var}\ T_{A}.
\end{eqnarray*}
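To make the last two identities concrete, here is a Monte Carlo sketch (not from the text) built directly from the exponential(1)-holds construction of the continuization; the chain, the set A, the starting state, and the sample size are arbitrary illustrative choices, so the printed moments should agree only up to sampling error.

```python
import numpy as np

# Monte Carlo illustration of E T_hat_A = E T_A and var T_hat_A = E T_A + var T_A,
# built from the exponential(1)-holds construction of the continuization.
# The chain, the set A and the starting state are arbitrary illustrative choices.
rng = np.random.default_rng(0)
P = np.array([[0.10, 0.40, 0.30, 0.20],
              [0.30, 0.20, 0.10, 0.40],
              [0.25, 0.25, 0.25, 0.25],
              [0.40, 0.10, 0.20, 0.30]])
A = {0, 1}
start = 2                                   # start outside A, so T_A > 0

def one_run():
    """Return (T_A, T_hat_A) for one path of the discrete chain and its continuization."""
    state, t_disc, t_cont = start, 0, 0.0
    while state not in A:
        t_cont += rng.exponential(1.0)      # exponential(1) hold before the next jump
        state = rng.choice(4, p=P[state])
        t_disc += 1
    return t_disc, t_cont

runs = np.array([one_run() for _ in range(100_000)])
TA, TAhat = runs[:, 0], runs[:, 1]
print("E T_A           =", TA.mean(),            "   E T_hat_A   =", TAhat.mean())
print("E T_A + var T_A =", TA.mean() + TA.var(), "   var T_hat_A =", TAhat.var())
```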