The purpose of this section is to give a systematic “probabilistic” treatment of a collection of general identities by deriving them from a single result, Proposition 2.3. We work in discrete time, but give the corresponding continuous-time results in section 2.2.3. Intuitively, a stopping time is a random time which can be specified by some on-line algorithm, together (perhaps) with external randomization.
Proposition 2.3. Consider the chain started at state $i$. Let $0 < S < \infty$ be a stopping time such that $X_S = i$ and $E_i S < \infty$. Let $j$ be an arbitrary state. Then
$$E_i(\mbox{number of visits to $j$ before time $S$}) = \pi_j \, E_i S.$$
In the phrase “number of visits to $j$ before time $S$”, our convention is to include time $0$ but exclude time $S$.
We shall give two different proofs. The first relies on a widely useful general theorem in stochastic processes.
Proof. Consider the renewal process whose inter-renewal time is distributed as $S$, for the chain started at $i$. The reward-renewal theorem (e.g. Ross [300] Thm. 3.6.1) says that the asymptotic proportion of time spent in state $j$ equals
$$\frac{E_i(\mbox{number of visits to $j$ before time $S$})}{E_i S}.$$
But this asymptotic average also equals $\pi_j$, by the ergodic theorem.
We like that proof for philosophical reasons: a good way to think about general identities is that they show one quantity calculated in two different ways. Here is an alternative proof of a slightly more general assertion. We refer to Propositions 2.3 and 2.4 as occupation measure identities.
Proposition 2.4. Let $\theta$ be a probability distribution on the state space $I$. Let $0 < S < \infty$ be a stopping time such that $P_\theta(X_S \in \cdot) = \theta(\cdot)$ and $E_\theta S < \infty$. Let $j$ be an arbitrary state. Then
$$E_\theta(\mbox{number of visits to $j$ before time $S$}) = \pi_j \, E_\theta S.$$
Proof. Write $\rho_j = E_\theta(\mbox{number of visits to $j$ before time $S$})$. We will show
$$\sum_j \rho_j p_{jk} = \rho_k \mbox{ for all } k. \tag{2.5}$$
Then by uniqueness of the stationary distribution, $\rho_j = c \pi_j$ for some constant $c$, and summing over $j$ gives $c = \sum_j \rho_j = E_\theta S$, so $\rho_j = \pi_j E_\theta S$ for all $j$.
Checking (2.5) is just a matter of careful notation.
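As a concrete check of Proposition 2.3, the following sketch computes both sides exactly (by linear algebra, no simulation) for a hypothetical irreducible 3-state chain; the transition matrix and the particular stopping time are our own illustrative choices, not from the text.

```python
import numpy as np

# A hypothetical irreducible 3-state transition matrix (illustration only).
P = np.array([[0.5, 0.25, 0.25],
              [0.2, 0.6,  0.2 ],
              [0.3, 0.3,  0.4 ]])
n = len(P)

# Stationary distribution: solve pi P = pi together with sum(pi) = 1.
pi = np.linalg.lstsq(np.vstack([P.T - np.eye(n), np.ones(n)]),
                     np.r_[np.zeros(n), 1.0], rcond=None)[0]

def hitting_times(P, a):
    """E_x T_a for every start x: h(a) = 0, h(x) = 1 + sum_y P[x,y] h(y)."""
    n = len(P)
    idx = [x for x in range(n) if x != a]
    h = np.zeros(n)
    h[idx] = np.linalg.solve(np.eye(n - 1) - P[np.ix_(idx, idx)],
                             np.ones(n - 1))
    return h

def visits_before(P, k, a):
    """E_x(number of visits to k, counting time 0, before T_a)."""
    n = len(P)
    idx = [x for x in range(n) if x != a]
    rhs = np.array([1.0 if x == k else 0.0 for x in idx])
    v = np.zeros(n)
    v[idx] = np.linalg.solve(np.eye(n - 1) - P[np.ix_(idx, idx)], rhs)
    return v

# Stopping time S = "first return to i after the first visit to j":
# then X_S = i and E_i S = E_i T_j + E_j T_i.
i, j, k = 0, 1, 2
EiS = hitting_times(P, j)[i] + hitting_times(P, i)[j]

# Visits to k before S: those before T_j (chain from i), plus those
# before T_i (chain restarted at j, by the strong Markov property).
lhs = visits_before(P, k, j)[i] + visits_before(P, k, i)[j]

# Proposition 2.3: E_i(visits to k before S) = pi_k * E_i S.
assert abs(lhs - pi[k] * EiS) < 1e-10
```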
The following series of formulas arise from particular choices of $i$ and $S$ in Proposition 2.3. For ease of later reference, we state them all together before starting the proofs. Some involve the quantity
$$Z_{ij} \equiv \sum_{t=0}^{\infty} \big( P_i(X_t = j) - \pi_j \big). \tag{2.6}$$
In the periodic case the sum may oscillate, so we interpret it as a Cesaro limit or (equivalently, but more simply) as the continuous-time limit (2.9). The matrix $Z$ is called the fundamental matrix (see Notes for alternative standardizations). Note that from the definition
$$\sum_j Z_{ij} = 0 \mbox{ for each } i. \tag{2.7}$$
Lemma 2.5. $E_i T_i^+ = 1/\pi_i$.

Lemma 2.6. For $j \neq i$,
$$E_i(\mbox{number of visits to $j$ before time $T_i^+$}) = \pi_j / \pi_i.$$

Lemma 2.7. For $i \neq j$,
$$E_j(\mbox{number of visits to $j$ before time $T_i$}) = \pi_j (E_i T_j + E_j T_i).$$

Corollary 2.8. For $i \neq j$,
$$P_i(T_j < T_i^+) = \frac{1}{\pi_i (E_i T_j + E_j T_i)}.$$

Lemma 2.9. For distinct $i, j, k$,
$$E_i(\mbox{number of visits to $j$ before time $T_k$}) = \pi_j (E_i T_k + E_k T_j - E_i T_j).$$

Corollary 2.10. For distinct $i, j, k$,
$$P_i(T_j < T_k) = \frac{E_i T_k + E_k T_j - E_i T_j}{E_j T_k + E_k T_j}.$$

Lemma 2.11. $E_\pi T_i = Z_{ii} / \pi_i$.

Lemma 2.12. $E_i T_j = (Z_{jj} - Z_{ij}) / \pi_j$.

Corollary 2.13. $\sum_j \pi_j E_i T_j = \sum_j Z_{jj}$ for each $i$.

Corollary 2.14 (the random target lemma). $\sum_j \pi_j E_i T_j$ does not depend on $i$.

Lemma 2.15. For arbitrary $i, j$,
$$E_\pi(\mbox{number of visits to $j$ before time $T_i$}) = \frac{\pi_j}{\pi_i} Z_{ii} - Z_{ij}.$$
Lemmas 2.11 and 2.12, which will be used frequently throughout the book, will both be referred to as the mean hitting time formula. See the Remark following the proofs for a two-line heuristic derivation of Lemma 2.12. A consequence of the mean hitting time formula is that knowing the matrix $Z$ is equivalent to knowing the matrix $(E_i T_j)$, since we can recover $Z$ as
$$Z_{ij} = \pi_j \Big( \sum_k \pi_k E_k T_j - E_i T_j \Big).$$
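For computation, the sum (2.6) has a standard closed form: writing $\Pi$ for the matrix whose every row is $\pi$, one has $Z = (I - P + \Pi)^{-1} - \Pi$ for an aperiodic chain. A quick numerical sketch (with a hypothetical 3-state chain of our own choosing) checks this against the defining sum and verifies (2.7):

```python
import numpy as np

# Hypothetical irreducible, aperiodic 3-state chain (illustration only).
P = np.array([[0.5, 0.25, 0.25],
              [0.2, 0.6,  0.2 ],
              [0.3, 0.3,  0.4 ]])
n = len(P)
pi = np.linalg.lstsq(np.vstack([P.T - np.eye(n), np.ones(n)]),
                     np.r_[np.zeros(n), 1.0], rcond=None)[0]

Pi = np.outer(np.ones(n), pi)                 # every row equals pi
Z = np.linalg.inv(np.eye(n) - P + Pi) - Pi    # closed form for (2.6)

# Direct truncation of the defining sum (2.6); the tail is geometrically
# small for this rapidly mixing chain, so 200 terms is plenty.
Z_direct = sum(np.linalg.matrix_power(P, t) - Pi for t in range(200))
assert np.allclose(Z, Z_direct, atol=1e-10)

# (2.7): every row of Z sums to 0; also pi Z = 0, again from the definition.
assert np.allclose(Z.sum(axis=1), 0.0)
assert np.allclose(pi @ Z, 0.0)
```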
Proofs. The simplest choice of $S$ in Proposition 2.3 is of course the first return time $T_i^+$. With this choice, the Proposition says
$$E_i(\mbox{number of visits to $j$ before time $T_i^+$}) = \pi_j \, E_i T_i^+.$$
Setting $j = i$ gives $1 = \pi_i E_i T_i^+$ (the only visit to $i$ before time $T_i^+$ is the one at time $0$), which is Lemma 2.5, and then the case of general $j$ gives Lemma 2.6.
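Lemmas 2.5 and 2.6 are easy to verify numerically. In the sketch below (a hypothetical 3-state chain of our own choosing), $E_i T_i^+$ and the expected visit counts are computed from the usual one-step linear systems:

```python
import numpy as np

# Hypothetical irreducible 3-state chain (illustration only).
P = np.array([[0.5, 0.25, 0.25],
              [0.2, 0.6,  0.2 ],
              [0.3, 0.3,  0.4 ]])
n = len(P)
pi = np.linalg.lstsq(np.vstack([P.T - np.eye(n), np.ones(n)]),
                     np.r_[np.zeros(n), 1.0], rcond=None)[0]

def absorbed(P, a, k=None):
    """With k None: f(x) = E_x T_a.  Otherwise: f(x) = E_x(number of
    visits to k, counting time 0, before T_a).  In both cases f(a) = 0."""
    n = len(P)
    idx = [x for x in range(n) if x != a]
    g = np.ones(len(idx)) if k is None else \
        np.array([1.0 if x == k else 0.0 for x in idx])
    f = np.zeros(n)
    f[idx] = np.linalg.solve(np.eye(len(idx)) - P[np.ix_(idx, idx)], g)
    return f

i, j = 0, 1

# Lemma 2.5: E_i T_i^+ = 1/pi_i, via E_i T_i^+ = 1 + sum_y P[i,y] E_y T_i.
EiTi_plus = 1.0 + P[i] @ absorbed(P, i)
assert abs(EiTi_plus - 1.0 / pi[i]) < 1e-10

# Lemma 2.6: E_i(visits to j before T_i^+) = pi_j / pi_i.
visits_j = P[i] @ absorbed(P, i, k=j)
assert abs(visits_j - pi[j] / pi[i]) < 1e-10
```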
Another choice of $S$ is “the first return to $i$ after the first visit to $j$”. Then $E_i S = E_i T_j + E_j T_i$, and the Proposition becomes Lemma 2.7, because there are no visits to $j$ before time $T_j$. For the chain started at $j$, the number of visits to $j$ (including time $0$) before hitting $i$ has geometric distribution, and so
$$E_j(\mbox{number of visits to $j$ before time $T_i$}) = \frac{1}{P_j(T_i < T_j^+)}.$$
So Corollary 2.8 follows from Lemma 2.7 (with $i$ and $j$ interchanged).
Another choice of $S$ is “the first return to $i$ after the first visit to $j$ after the first visit to $k$”, where $i, j, k$ are distinct. The Proposition says
$$\pi_j (E_i T_k + E_k T_j + E_j T_i) = E_i(\mbox{number of visits to $j$ before time $T_k$}) + E_j(\mbox{number of visits to $j$ before time $T_i$}).$$
Lemma 2.7 gives an expression for the final expectation, and we deduce that (for distinct $i, j, k$)
$$E_i(\mbox{number of visits to $j$ before time $T_k$}) = \pi_j (E_i T_k + E_k T_j - E_i T_j).$$
This is the assertion of Lemma 2.9, and the identity remains true if $j = i$ (where it becomes Lemma 2.7) or if $k = i$ (where it reduces to $0 = 0$). We deduce Corollary 2.10 by writing
$$E_i(\mbox{number of visits to $j$ before time $T_k$}) = P_i(T_j < T_k) \, E_j(\mbox{number of visits to $j$ before time $T_k$})$$
and using Lemma 2.7 to evaluate the final expectation.
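Lemma 2.9 and Corollary 2.10 can likewise be checked numerically. In the sketch below (hypothetical 3-state chain, our own choice), hitting times, occupation counts, and the probability $P_i(T_j < T_k)$ are each computed from their standard linear systems:

```python
import numpy as np

# Hypothetical irreducible 3-state chain (illustration only).
P = np.array([[0.5, 0.25, 0.25],
              [0.2, 0.6,  0.2 ],
              [0.3, 0.3,  0.4 ]])
n = len(P)
pi = np.linalg.lstsq(np.vstack([P.T - np.eye(n), np.ones(n)]),
                     np.r_[np.zeros(n), 1.0], rcond=None)[0]

def solve_absorbed(P, a, g):
    """f(a) = 0; f(x) = g(x) + sum_y P[x,y] f(y) for x != a."""
    idx = [x for x in range(len(P)) if x != a]
    f = np.zeros(len(P))
    f[idx] = np.linalg.solve(np.eye(len(idx)) - P[np.ix_(idx, idx)], g[idx])
    return f

def ET(P, a):                       # E_x T_a
    return solve_absorbed(P, a, np.ones(len(P)))

def visits(P, k, a):                # E_x(visits to k, incl. time 0, before T_a)
    return solve_absorbed(P, a, (np.arange(len(P)) == k).astype(float))

def prob_before(P, j, k):
    """f(x) = P_x(T_j < T_k): f(j) = 1, f(k) = 0, harmonic elsewhere."""
    idx = [x for x in range(len(P)) if x not in (j, k)]
    f = np.zeros(len(P))
    f[j] = 1.0
    A = np.eye(len(idx)) - P[np.ix_(idx, idx)]
    f[idx] = np.linalg.solve(A, P[np.ix_(idx, [j])].ravel())
    return f

i, j, k = 0, 1, 2
EiTj, EiTk = ET(P, j)[i], ET(P, k)[i]
EkTj, EjTk = ET(P, j)[k], ET(P, k)[j]

# Lemma 2.9: E_i(visits to j before T_k) = pi_j (E_iT_k + E_kT_j - E_iT_j).
assert abs(visits(P, j, k)[i] - pi[j] * (EiTk + EkTj - EiTj)) < 1e-10

# Corollary 2.10: P_i(T_j < T_k) = (E_iT_k + E_kT_j - E_iT_j)/(E_jT_k + E_kT_j).
assert abs(prob_before(P, j, k)[i]
           - (EiTk + EkTj - EiTj) / (EjTk + EkTj)) < 1e-10
```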
We now get slightly more ingenious. Fix a time $t$ and define $S$ as the time taken by the following two-stage procedure (for the chain started at $i$).

(i) Wait time $t$.

(ii) Then wait (if necessary) until the chain next hits $i$.

Then the Proposition (with $j = i$) says
$$\sum_{s=0}^{t-1} P_i(X_s = i) = \pi_i \big( t + E_{\rho(t)} T_i \big) \tag{2.8}$$
where $\rho(t) = P_i(X_t \in \cdot)$. Rearranging,
$$\sum_{s=0}^{t-1} \big( P_i(X_s = i) - \pi_i \big) = \pi_i E_{\rho(t)} T_i.$$
Letting $t \to \infty$ we have $\rho(t) \to \pi$ by the convergence theorem (strictly, we should give a separate argument for the periodic case, but it's simpler to translate the argument to continuous time, where the periodicity issue doesn't arise) and we obtain Lemma 2.11.
For Lemma 2.12, where we may take $i \neq j$, we combine the previous ideas. Again fix $t$ and define $S$ as the time taken by the following three-stage procedure (for the chain started at $j$).

(i) Wait until the chain hits $i$.

(ii) Then wait a further time $t$.

(iii) Then wait (if necessary) until the chain next hits $j$.

Applying Proposition 2.3 with this $S$ and with the state $j$ gives
$$E_j(\mbox{number of visits to $j$ before time $T_i$}) + \sum_{s=0}^{t-1} P_i(X_s = j) = \pi_j \big( E_j T_i + t + E_{\rho(t)} T_j \big)$$
where $\rho(t) = P_i(X_t \in \cdot)$. Subtracting the equality of Lemma 2.7 and rearranging, we get
$$\sum_{s=0}^{t-1} \big( P_i(X_s = j) - \pi_j \big) = \pi_j E_{\rho(t)} T_j - \pi_j E_i T_j.$$
Letting $t \to \infty$, we have (as above) $\rho(t) \to \pi$, giving
$$Z_{ij} = \pi_j E_\pi T_j - \pi_j E_i T_j.$$
Appealing to Lemma 2.11 we get Lemma 2.12. Corollary 2.13 follows from Lemma 2.12 by summing over $j$ and using (2.7), and Corollary 2.14 is immediate from Corollary 2.13.
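The mean hitting time formula and the random target lemma are also easy to verify numerically. The sketch below (hypothetical 3-state chain, our own choice) computes $Z$ in closed form together with the full matrix $(E_i T_j)$, and checks Lemmas 2.11 and 2.12 and Corollaries 2.13 and 2.14:

```python
import numpy as np

# Hypothetical irreducible, aperiodic 3-state chain (illustration only).
P = np.array([[0.5, 0.25, 0.25],
              [0.2, 0.6,  0.2 ],
              [0.3, 0.3,  0.4 ]])
n = len(P)
pi = np.linalg.lstsq(np.vstack([P.T - np.eye(n), np.ones(n)]),
                     np.r_[np.zeros(n), 1.0], rcond=None)[0]
Pi = np.outer(np.ones(n), pi)
Z = np.linalg.inv(np.eye(n) - P + Pi) - Pi    # fundamental matrix (2.6)

# Matrix of mean hitting times: ET[x, a] = E_x T_a (so ET[a, a] = 0).
ET = np.zeros((n, n))
for a in range(n):
    idx = [x for x in range(n) if x != a]
    ET[idx, a] = np.linalg.solve(np.eye(n - 1) - P[np.ix_(idx, idx)],
                                 np.ones(n - 1))

# Lemma 2.11: E_pi T_a = Z[a,a] / pi_a.
assert np.allclose(pi @ ET, np.diag(Z) / pi)

# Lemma 2.12: E_x T_a = (Z[a,a] - Z[x,a]) / pi_a.
assert np.allclose(ET, (np.diag(Z)[None, :] - Z) / pi[None, :])

# Corollaries 2.13-2.14: sum_a pi_a E_x T_a = trace(Z), the same for all x.
assert np.allclose(ET @ pi, np.trace(Z))
```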
To prove Lemma 2.15, consider again the argument for (2.8), but now apply the Proposition with an arbitrary state $j$ in place of $i$. This gives
$$\sum_{s=0}^{t-1} P_i(X_s = j) + E_{\rho(t)}(\mbox{number of visits to $j$ before time $T_i$}) = \pi_j \big( t + E_{\rho(t)} T_i \big)$$
where $\rho(t) = P_i(X_t \in \cdot)$. Rearranging,
$$\sum_{s=0}^{t-1} \big( P_i(X_s = j) - \pi_j \big) = \pi_j E_{\rho(t)} T_i - E_{\rho(t)}(\mbox{number of visits to $j$ before time $T_i$}).$$
Letting $t \to \infty$ gives
$$Z_{ij} = \pi_j E_\pi T_i - E_\pi(\mbox{number of visits to $j$ before time $T_i$}),$$
which rearranges to Lemma 2.15 upon writing $\pi_j E_\pi T_i = (\pi_j/\pi_i) Z_{ii}$ (Lemma 2.11).
Remark. We promised a two-line heuristic derivation of the mean hitting time formula, and here it is. Write
$$\sum_{t=0}^{\infty} \big( 1_{(X_t = j)} - \pi_j \big) = \sum_{t=T_j}^{\infty} \big( 1_{(X_t = j)} - \pi_j \big) - \pi_j T_j.$$
Take $E_i$ of each term to get $Z_{ij} = Z_{jj} - \pi_j E_i T_j$. Of course this argument doesn't make sense, because the sums do not converge. Implicit in our (honest) proof is a justification of this argument by a limiting procedure.
Patterns in coin-tossing.
This is a classical example for which $Z$ is easy to calculate. Fix $r \geq 1$. Toss a fair coin repeatedly, and let $(X_t)$ be the successive overlapping $r$-tuples of outcomes. For example (with $r = 4$), if the tosses give $H, T, T, H, H, \ldots$ then $X_0 = (H,T,T,H)$, $X_1 = (T,T,H,H)$, and so on.
So $X$ is a Markov chain on the set $I$ of $r$-tuples, and its stationary distribution $\pi$ is uniform on $I$: $\pi_x = 2^{-r}$. For patterns $x, y$ and $0 \leq u \leq r-1$, write $c_u(x,y)$ for the indicator of the set “pattern $y$, shifted right by $u$ places, agrees with pattern $x$ where they overlap”: formally, $c_u(x,y)$ is the indicator of the set
$$\{ (x,y) : y_v = x_{u+v} \mbox{ for } 1 \leq v \leq r-u \}.$$
For example, with $r = 3$, $x = HTH$ and $y = THH$, we have $c_1(x,y) = 1$, since the last two letters of $x$ are $TH$, which are the first two letters of $y$; but $c_2(x,y) = 0$, since $y_1 = T \neq H = x_3$.
Then, from the definition of $Z$, and the fact that $X_0$ and $X_u$ are independent for $u \geq r$,
$$Z_{xy} = \sum_{u=0}^{r-1} \big( 2^{-u} c_u(x,y) - 2^{-r} \big),$$
since $P_x(X_u = y) = 2^{-u} c_u(x,y)$ for $0 \leq u \leq r-1$ and $P_x(X_u = y) = 2^{-r}$ for $u \geq r$.
So we can read off many facts about patterns in coin-tossing from the general results of this section. For instance, the mean hitting time formula (Lemma 2.11) says $E_\pi T_x = 2^r Z_{xx}$. Note that “time $0$” for the chain is the $r$'th toss, at which point the chain is in its stationary distribution. So the mean number of tosses until first seeing pattern $x$ equals
$$r + E_\pi T_x = r + \sum_{u=0}^{r-1} \big( 2^{r-u} c_u(x,x) - 1 \big) = \sum_{u=0}^{r-1} 2^{r-u} c_u(x,x).$$
For $r = 2$ and $x = HH$, the reader may check this mean number is $6$. We leave the interested reader to explore further; in particular, find three patterns $x^{(1)}, x^{(2)}, x^{(3)}$ such that
$$P(\mbox{$x^{(1)}$ appears before $x^{(2)}$}) > \tfrac12, \quad P(\mbox{$x^{(2)}$ appears before $x^{(3)}$}) > \tfrac12, \quad P(\mbox{$x^{(3)}$ appears before $x^{(1)}$}) > \tfrac12.$$
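The overlap formula $\sum_{u=0}^{r-1} 2^{r-u} c_u(x,x)$ for the mean number of tosses can be checked directly. The sketch below compares it against an exact computation on the prefix-matching chain (a smaller chain of our own than the chain on $r$-tuples used in the text, but with the same time of first appearance of $x$):

```python
import numpy as np

def mean_tosses_exact(x):
    """Mean number of fair-coin tosses until pattern x first appears,
    computed from the prefix-matching Markov chain on {0, ..., len(x)}."""
    r = len(x)

    def step(s, c):
        # Longest suffix of x[:s] + c that is a prefix of x.
        w = x[:s] + c
        while w and not x.startswith(w):
            w = w[1:]
        return len(w)

    # Solve h(s) = 1 + 0.5 h(step(s,'H')) + 0.5 h(step(s,'T')), h(r) = 0.
    A = np.eye(r)
    for s in range(r):
        for c in 'HT':
            t = step(s, c)
            if t < r:
                A[s, t] -= 0.5
    return np.linalg.solve(A, np.ones(r))[0]       # h(0)

def mean_tosses_formula(x):
    """sum_u c_u(x,x) 2^{r-u}: equivalently, 2^u over each overlap length u
    at which the length-u prefix of x equals its length-u suffix."""
    r = len(x)
    return sum(2 ** u for u in range(1, r + 1) if x[:u] == x[-u:])

for x in ['HH', 'HT', 'HTH', 'HHH', 'HTHH', 'TTHT']:
    assert abs(mean_tosses_exact(x) - mean_tosses_formula(x)) < 1e-9

assert mean_tosses_formula('HH') == 6
assert mean_tosses_formula('HTH') == 10
assert mean_tosses_formula('HHH') == 14
```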
Further results. One can of course obtain expressions in the spirit of Lemmas 2.5–2.15 for more complicated quantities; the reader may care to derive some further such expressions.
Warning. Hitting times on subsets $A$ of states will be studied later (e.g. Chapter 3 section 5.3) (yyy 9/2/94 version) in the reversible setting. It is important to note that results often do not extend simply from singleton states to subsets. For instance, one might guess that Lemma 2.11 could be extended by replacing the single target state with a subset $A$, but it is easy to make examples where the natural analog is false.
Here we record the continuous-time versions of the results of the previous section. Write
$$Z_{ij} \equiv \int_0^{\infty} \big( P_i(X_t = j) - \pi_j \big) \, dt. \tag{2.9}$$
This is consistent with (2.6) in that $Z$ is the same for a discrete-time chain and its continuized chain. Recall from section 2.1.2 the redefinition (b) of $T_i^+$ in continuous time. In place of “number of visits to $j$” we use “total duration of time spent in $j$”. With this substitution, Proposition 2.3 and the other results of the previous section extend to continuous time with only the following changes, which occur because the mean sojourn time in a state $i$ is $1/q_i$ in continuous time, rather than $1$ as in discrete time.
Lemma 2.5. $E_i T_i^+ = \dfrac{1}{q_i \pi_i}$.

Lemma 2.6. For $j \neq i$, $E_i(\mbox{duration of time spent in $j$ before time $T_i^+$}) = \dfrac{\pi_j}{q_i \pi_i}$.

Corollary 2.8. For $j \neq i$,
$$P_i(T_j < T_i^+) = \frac{1}{q_i \pi_i (E_i T_j + E_j T_i)}.$$