Here we record the behavior of our parameters under two natural operations on chains.
Given a Markov chain $(X_t)$ on state space $S$ and a function $f: S \to \hat{S}$, the process $(f(X_t))$ is typically not a Markov chain. But we can invent a chain which substitutes for it. In discrete time (the continuous case is similar) define the induced chain to have transition matrix
$$\hat{p}(a,b) = \frac{\sum_{i \in f^{-1}(a)} \pi_i \sum_{j \in f^{-1}(b)} p_{ij}}{\sum_{i \in f^{-1}(a)} \pi_i}, \qquad a, b \in \hat{S}. \tag{4.39}$$
More informatively, we are matching the stationary flow rates:
$$\hat{\pi}_a\, \hat{p}(a,b) = \sum_{i \in f^{-1}(a)} \sum_{j \in f^{-1}(b)} \pi_i\, p_{ij}, \qquad \text{where } \hat{\pi}_a := \sum_{i \in f^{-1}(a)} \pi_i. \tag{4.40}$$
The reader may check that (4.39) and (4.40) are equivalent. Under our standing assumption that the original chain is reversible, the induced chain is also reversible (though the construction works for general chains as well). In the electrical network interpretation, we are shorting together vertices with the same $f$-value. It seems intuitively plausible that this “shorting” can only decrease our parameters describing convergence and mean hitting time behavior.
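As a concrete sanity check, here is a small numerical sketch of the induced-chain construction (the weight matrix and the collapsing map $f$ below are arbitrary choices, not from the text): it builds a reversible walk on a weighted graph, collapses states by matching stationary flow rates as in (4.40), and verifies that the result is a genuine reversible chain with the collapsed stationary distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# A reversible chain: discrete-time random walk on a weighted graph
# (symmetric weight matrix W), with stationary distribution pi_i ~ sum_j W_ij.
n = 6
W = rng.random((n, n))
W = W + W.T
np.fill_diagonal(W, 0.0)
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()

# A (hypothetical) collapsing map f onto 3 superstates.
f = np.array([0, 0, 1, 1, 2, 2])
m = 3

# Induced chain: match stationary flow rates between superstates,
#   pihat_a phat(a,b) = sum_{f(i)=a, f(j)=b} pi_i p_ij .
flow = pi[:, None] * P                               # stationary flow pi_i p_ij
A = np.equal.outer(f, np.arange(m)).astype(float)    # n x m indicator of f(i)=a
flowhat = A.T @ flow @ A                             # collapsed flow rates
pihat = A.T @ pi
Phat = flowhat / pihat[:, None]

# The induced chain is a genuine transition matrix with stationary
# distribution pihat, and it inherits reversibility (detailed balance).
assert np.allclose(Phat.sum(axis=1), 1.0)
assert np.allclose(pihat @ Phat, pihat)
assert np.allclose(pihat[:, None] * Phat, (pihat[:, None] * Phat).T)
```

Reversibility of the induced chain drops out immediately here: the collapsed flow matrix inherits the symmetry of the original flow matrix.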
The values of $\tau^*$, $\tau_0$, $\tau_2$ and $\tau_c$ in an induced chain are less than or equal to the corresponding values in the original chain.
Proof. A function $g: \hat{S} \to \mathbb{R}$ pulls back to a function $g \circ f: S \to \mathbb{R}$ with the same mean, the same variance, and the same Dirichlet form. So the Dirichlet principle (Chapter 3 yyy) shows that mean commute times can only decrease when passing to an induced chain:
$$\hat{E}_a \hat{T}_b + \hat{E}_b \hat{T}_a \ \le \ E_i T_j + E_j T_i \qquad \text{whenever } f(i) = a,\ f(j) = b.$$
This establishes the assertion for $\tau^*$ and $\tau_0$, and the extremal characterization of relaxation time works similarly for $\tau_2$. The assertion about $\tau_c$ is immediate from the definition, since a partition of $\hat{S}$ pulls back to a partition of $S$.
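The commute-time assertion in the proof can be checked numerically. The sketch below (with a hypothetical weighted graph and collapsing map of my own choosing) computes mean hitting times via the fundamental matrix and confirms that every commute time between superstates of the induced chain is at most the commute time between any pair of original states mapping to them.

```python
import numpy as np

def hitting_times(P, pi):
    # Mean hitting times E_i T_j via the fundamental matrix
    # Z = (I - P + 1 pi^T)^{-1}:  E_i T_j = (Z_jj - Z_ij) / pi_j .
    n = len(pi)
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    return (np.diag(Z)[None, :] - Z) / pi[None, :]

rng = np.random.default_rng(1)

# Reversible chain: random walk on a weighted graph.
n = 6
W = rng.random((n, n))
W = W + W.T
np.fill_diagonal(W, 0.0)
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()

# Induced chain under a (hypothetical) collapsing map f.
f = np.array([0, 0, 1, 1, 2, 2])
m = 3
A = np.equal.outer(f, np.arange(m)).astype(float)
flowhat = A.T @ (pi[:, None] * P) @ A
pihat = A.T @ pi
Phat = flowhat / pihat[:, None]

T = hitting_times(P, pi)
That = hitting_times(Phat, pihat)

# Shorting can only decrease commute times:
# Ea Tb + Eb Ta (induced) <= Ei Tj + Ej Ti whenever f(i)=a, f(j)=b, a != b.
for i in range(n):
    for j in range(n):
        a, b = f[i], f[j]
        if a != b:
            assert That[a, b] + That[b, a] <= T[i, j] + T[j, i] + 1e-9
```

This is exactly the "shorting decreases effective resistance" phenomenon: the total edge weight is preserved under collapsing, so the commute time identity turns Rayleigh monotonicity into the displayed inequality.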
On the other hand, it is easy to see that shorting may increase a one-sided mean hitting time: for random walk on the unweighted graph on the left of the figure, the mean hitting time between two marked vertices takes one value, but when two vertices are shorted together to form a single vertex in the graph on the right, the corresponding one-sided mean hitting time becomes strictly larger.
Finally, the behavior of the $\tau_1$-family under shorting is unclear.
Is the value of $\tau_1$ in an induced chain bounded by $K$ times the value of $\tau_1$ in the original chain, for some absolute constant $K$? For the other members of the $\tau_1$-family?
Given Markov chains on state spaces $S^{(1)}$ and $S^{(2)}$, there is a natural concept of a “product chain” on the product state space $S^{(1)} \times S^{(2)}$. It is worth writing this concept out in detail, for two reasons. First, to prevent confusion between several different possible definitions in discrete time. Second, because the behavior of relaxation times of product chains is relevant to simple examples and has a surprising application (section 4.6.3).
As usual, things are simplest in continuous time. Define the product chain to be
$$Z_t = \big(X^{(1)}_t, X^{(2)}_t\big)$$
where the components $(X^{(1)}_t)$ and $(X^{(2)}_t)$ are independent versions of the given chains. So the transition rates are
$$q\big((i_1,i_2),(j_1,j_2)\big) = \begin{cases} q^{(1)}(i_1,j_1) & \text{if } i_2 = j_2,\ j_1 \neq i_1 \\ q^{(2)}(i_2,j_2) & \text{if } i_1 = j_1,\ j_2 \neq i_2 \\ 0 & \text{otherwise.} \end{cases} \tag{4.41}$$
Using the interpretation of relaxation time as the asymptotic rate of convergence of transition probabilities (Chapter 3 yyy), it is immediate that the product chain has relaxation time
$$\tau_2 = \max\big(\tau_2^{(1)}, \tau_2^{(2)}\big). \tag{4.42}$$
In discrete time there are two different general notions of “product chain”. One could consider the chain whose coordinates move simultaneously as the two independent chains. This is the chain with transition probabilities
$$p\big((i_1,i_2),(j_1,j_2)\big) = p^{(1)}(i_1,j_1)\, p^{(2)}(i_2,j_2),$$
whose eigenvalues are the products $\lambda^{(1)}_a \lambda^{(2)}_b$, and it has relaxation time $\max\big(\tau_2^{(1)}, \tau_2^{(2)}\big)$.
But it is more natural to define the product chain to be the chain which at each step picks one of the two coordinates at random and makes a step in that coordinate, i.e. the chain with transition probabilities
$$p\big((i_1,i_2),(j_1,j_2)\big) = \begin{cases} \tfrac12\, p^{(1)}(i_1,j_1) & \text{if } i_2 = j_2,\ j_1 \neq i_1 \\ \tfrac12\, p^{(2)}(i_2,j_2) & \text{if } i_1 = j_1,\ j_2 \neq i_2 \\ \tfrac12\, p^{(1)}(i_1,i_1) + \tfrac12\, p^{(2)}(i_2,i_2) & \text{if } (j_1,j_2) = (i_1,i_2). \end{cases}$$
This is the jump chain derived from the product of the continuized chains, and it has relaxation time
$$\tau_2 = 2 \max\big(\tau_2^{(1)}, \tau_2^{(2)}\big). \tag{4.43}$$
Again, this can be seen without need for calculation: the continuization of this product chain is just the continuous-time product chain run at half speed.
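Both discrete-time constructions, and the relaxation-time assertions above, can be verified numerically. In the sketch below (two small lazy reversible chains of my own choosing; laziness keeps all eigenvalues nonnegative, so the second-largest eigenvalue governs the relaxation time) the independent-coordinates product has relaxation time $\max(\tau_2^{(1)},\tau_2^{(2)})$, while the move-one-coordinate product has twice that.

```python
import numpy as np

def relax(P, pi):
    # Relaxation time 1/(1 - lambda_2) of a reversible chain, computed
    # from the symmetrized matrix D^{1/2} P D^{-1/2}, D = diag(pi).
    D = np.sqrt(pi)
    S = (D[:, None] * P) / D[None, :]
    lam = np.sort(np.linalg.eigvalsh((S + S.T) / 2))
    return 1.0 / (1.0 - lam[-2])

def lazy_walk(W):
    # Reversible lazy random walk on a weighted graph; laziness keeps
    # all eigenvalues nonnegative.
    W = W + W.T
    P = W / W.sum(axis=1, keepdims=True)
    return 0.5 * (np.eye(len(W)) + P), W.sum(axis=1) / W.sum()

rng = np.random.default_rng(2)
P1, pi1 = lazy_walk(rng.random((4, 4)))
P2, pi2 = lazy_walk(rng.random((5, 5)))
pi_prod = np.kron(pi1, pi2)
I1, I2 = np.eye(4), np.eye(5)

# Product 1: both coordinates move simultaneously and independently.
P_ind = np.kron(P1, P2)

# Product 2: pick a coordinate at random, move only that coordinate.
P_half = 0.5 * np.kron(P1, I2) + 0.5 * np.kron(I1, P2)

t1, t2 = relax(P1, pi1), relax(P2, pi2)
assert np.isclose(relax(P_ind, pi_prod), max(t1, t2))
assert np.isclose(relax(P_half, pi_prod), 2 * max(t1, t2))   # (4.43)
```

The Kronecker-product structure makes the two spectra transparent: products $\lambda^{(1)}_a \lambda^{(2)}_b$ in the first case, averages $\tfrac12(\lambda^{(1)}_a + \lambda^{(2)}_b)$ in the second.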
This definition and (4.43) extend to $d$-fold products in the obvious way, with the factor $2$ replaced by $d$. Random walk on the torus $Z^d_m$ is the product of $d$ copies of random walk on $Z_m$, and random walk on the $d$-cube (Chapter 5 yyy) is the product of $d$ copies of random walk on the $2$-element set $Z_2$.
Just to make things more confusing, given graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$, the Cartesian product graph $G_1 \times G_2$ is defined to have vertex set $V_1 \times V_2$ and edges
$$\big((v_1,v_2),(w_1,w_2)\big) \ \text{ with either }\ v_1 = w_1,\ (v_2,w_2) \in E_2, \ \text{ or }\ v_2 = w_2,\ (v_1,w_1) \in E_1.$$
If both $G_1$ and $G_2$ are $r$-regular for the same $r$, then random walk on the product graph is exactly the product (in the second sense above) of the random walks on the individual graphs. But in general, discrete-time random walk on the product graph is the jump chain of the product of the fluid model (Chapter 3 yyy) continuous-time random walks. So if the graphs are $r_1$- and $r_2$-regular then the discrete-time random walk on the product graph has the product of the uniform distributions as its stationary distribution, and has relaxation time
$$\tau_2 = (r_1 + r_2)\ \max\big(\tau_2^{(1)}/r_1,\ \tau_2^{(2)}/r_2\big).$$
But for non-regular graphs, neither assertion is true.
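For regular graphs the assertions above are easy to test. The following sketch (using $C_5$ and $K_4$, a hypothetical choice of graphs) builds the Cartesian product graph, checks its regularity, and verifies the relaxation-time formula $\tau_2 = (r_1+r_2)\max(\tau_2^{(1)}/r_1,\ \tau_2^{(2)}/r_2)$.

```python
import numpy as np

def cycle(n):
    # Adjacency matrix of the n-cycle.
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

def relax(A, r):
    # Relaxation time 1/(1 - lambda_2) of random walk on an r-regular
    # graph; A/r is symmetric, so eigvalsh applies directly.
    lam = np.sort(np.linalg.eigvalsh(A / r))
    return 1.0 / (1.0 - lam[-2])

A1, r1 = cycle(5), 2                        # C5 is 2-regular
A2, r2 = np.ones((4, 4)) - np.eye(4), 3     # K4 is 3-regular

# Cartesian product graph: a copy of G2 at each vertex of G1, plus
# matching edges between copies along the edges of G1.
A = np.kron(A1, np.eye(4)) + np.kron(np.eye(5), A2)
assert np.allclose(A.sum(axis=1), r1 + r2)  # product graph is (r1+r2)-regular

# Relaxation time of the walk on the product graph.
t = relax(A, r1 + r2)
t1, t2 = relax(A1, r1), relax(A2, r2)
assert np.isclose(t, (r1 + r2) * max(t1 / r1, t2 / r2))
```

Since a regular graph's walk has the uniform stationary distribution, the product-distribution claim is automatic here; the interesting check is the relaxation time.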
Let us briefly discuss the behavior of some other parameters under products. For the continuous-time product chain (4.41), the total variation distance of section 4.3 satisfies
$$\bar{d}(t) \le \bar{d}^{(1)}(t) + \bar{d}^{(2)}(t)$$
and, using submultiplicativity of $\bar{d}$, we deduce the crude bound
$$\tau_1 \le 2\big(\tau_1^{(1)} + \tau_1^{(2)}\big)$$
where superscripts here refer to the two component chains and not to the variant parameters of section 4.3.1. For the discrete-time product chain there is an extra factor of 2 from “slowing down” (cf. (4.42),(4.43)), leading to
$$\tau_1 \le 4\big(\tau_1^{(1)} + \tau_1^{(2)}\big).$$
Here our conventions are a bit confusing: this last inequality refers to the discrete-time product chain, but as in section 4.3 we define $\tau_1$ via the continuized chain – we leave the reader to figure out the analogous result for the discrete-time variant of $\tau_1$ discussed in section 4.3.3.
To state a result for $\tau_0$, consider the continuous-time product of $d$ independent copies of the same $n$-state chain. If the underlying chain has eigenvalues $0 = \lambda_1 < \lambda_2 \le \cdots \le \lambda_n$ then the product chain has eigenvalues $\{\lambda_{i_1} + \cdots + \lambda_{i_d}\}$, and so by the eigentime identity
$$\tau_0 = \sum_{(i_1,\ldots,i_d) \neq (1,\ldots,1)} \frac{1}{\lambda_{i_1} + \cdots + \lambda_{i_d}}.$$
Thus in discrete time
$$\tau_0 = \sum_{(i_1,\ldots,i_d) \neq (1,\ldots,1)} \frac{d}{(1-\lambda_{i_1}) + \cdots + (1-\lambda_{i_d})} \tag{4.44}$$
where now $1 = \lambda_1 > \lambda_2 \ge \cdots \ge \lambda_n$ are the eigenvalues of the underlying discrete-time chain.
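The spectral claim and the eigentime identity for products can be confirmed numerically. In the sketch below (a random reversible 3-state chain, continuized, and its 2-fold product; the chain is my own choice) the product generator's eigenvalues are exactly the pairwise sums, and the sum of reciprocal nonzero eigenvalues matches the random target time computed directly from hitting times.

```python
import numpy as np

rng = np.random.default_rng(3)

# Reversible n-state chain, continuized: generator Q = P - I.
n = 3
W = rng.random((n, n))
W = W + W.T
P = W / W.sum(axis=1, keepdims=True)
pi = W.sum(axis=1) / W.sum()
Q = P - np.eye(n)

# Two-fold continuous-time product chain: generator is the Kronecker sum.
Q2 = np.kron(Q, np.eye(n)) + np.kron(np.eye(n), Q)
pi2 = np.kron(pi, pi)

def spectrum(Q, pi):
    # Eigenvalues 0 = lam_1 < lam_2 <= ... of -Q, via symmetrization.
    D = np.sqrt(pi)
    S = (D[:, None] * (-Q)) / D[None, :]
    return np.sort(np.linalg.eigvalsh((S + S.T) / 2))

lam = spectrum(Q, pi)
lam2 = spectrum(Q2, pi2)

# The product chain's eigenvalues are the pairwise sums lam_i + lam_j.
sums = np.sort(np.add.outer(lam, lam).ravel())
assert np.allclose(lam2, sums)

# Eigentime identity for the product: tau0 = sum of 1/lam over nonzero
# eigenvalues, which equals the random target time sum_j pi_j E_i T_j.
tau0 = (1.0 / lam2[1:]).sum()

def hit(Q, j):
    # E_i T_j for a continuous-time chain: solve (-Q restricted) h = 1.
    n = Q.shape[0]
    idx = [k for k in range(n) if k != j]
    h = np.zeros(n)
    h[idx] = np.linalg.solve(-Q[np.ix_(idx, idx)], np.ones(n - 1))
    return h

N = n * n
target = sum(pi2[j] * hit(Q2, j)[0] for j in range(N))
assert np.isclose(tau0, target)
```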
The results above concerning relaxation times of product chains are essentially obvious using the interpretation of relaxation time as asymptotic rate of convergence of transition probabilities, but they are much less obvious using the extremal interpretation. Indeed, consider the $d$-fold product of a single chain with itself. Write $\theta_t$ for the joint distribution of the stationary underlying chain at times $0$ and $t$, and $\tau_2$ for the relaxation time of the underlying chain. Combining (4.43) with the extremal characterization (4.22) of the relaxation time for the product chain, a brief calculation gives the following result.
Let $f: S^d \to \mathbb{R}$ be arbitrary. Let $(X^1_t), \ldots, (X^d_t)$ be independent copies of the stationary underlying chain. Fix $t > 0$; let $X = (X^1_0, \ldots, X^d_0)$ and let $X^{(i)} = (X^1_0, \ldots, X^{i-1}_0, X^i_t, X^{i+1}_0, \ldots, X^d_0)$. Then
$$\operatorname{var} f(X) \le \big(1 - e^{-t/\tau_2}\big)^{-1}\ \tfrac12 \sum_{i=1}^d E\big(f(X^{(i)}) - f(X)\big)^2.$$
To appreciate this, consider the “trivial” case where the underlying Markov chain is just an i.i.d. sequence with distribution $\pi$ on $S$: at rate $1$ the current state is replaced by an independent $\pi$-distributed sample. Then $\tau_2 = 1$ and the random variables $X^1_0, \ldots, X^d_0$ are i.i.d. with distribution $\pi$. And this special case of Corollary 4.46 becomes (4.45) below, because for each $i$ the conditional distribution of $X^i_t$, given that a replacement has occurred by time $t$, is that of an independent sample $\xi'_i$, while on the complementary event $f(X^{(i)}) - f(X) = 0$; the replacement probability $1 - e^{-t}$ cancels the factor $(1 - e^{-t})^{-1}$.
Let $f: S^d \to \mathbb{R}$ be arbitrary. Let $\xi_1, \ldots, \xi_d, \xi'_1, \ldots, \xi'_d$ be i.i.d. with distribution $\pi$. Let $\xi = (\xi_1, \ldots, \xi_d)$ and let $\xi^{(i)} = (\xi_1, \ldots, \xi_{i-1}, \xi'_i, \xi_{i+1}, \ldots, \xi_d)$. Then
$$\operatorname{var} f(\xi) \le \tfrac12 \sum_{i=1}^d E\big(f(\xi^{(i)}) - f(\xi)\big)^2. \tag{4.45}$$
If $f$ is symmetric then
$$\operatorname{var} f(\xi) \le E \sum_{i=1}^{d+1} \big(S_i - \bar{S}\big)^2 \tag{4.46}$$
where $S_i = f(\xi_1, \ldots, \xi_{i-1}, \xi_{i+1}, \ldots, \xi_{d+1})$ and $\bar{S} = \frac{1}{d+1} \sum_{i=1}^{d+1} S_i$.
Note that in the symmetric case we may regard the right side of (4.46) as (up to normalization) the expectation of the jackknife estimate of variance. This reveals (4.46) to be the celebrated Efron–Stein inequality in statistics, and in fact (4.45) is a known variant (see Notes).
Proof. As observed above, (4.45) is a special case of Corollary 4.46. So it is enough to show that for symmetric $f$ the right sides of (4.45) and (4.46) are equal. Note that by symmetry $E\big(f(\xi^{(i)}) - f(\xi)\big)^2$ does not depend on $i$, and $E(S_i - S_j)^2$ does not depend on $(i,j)$, for $i \neq j$. So the right side of (4.45) is
$$\tfrac{d}{2}\, E(S_1 - S_2)^2.$$
But it is easy to calculate
$$\sum_{i=1}^{d+1} (S_i - \bar{S})^2 = \frac{1}{2(d+1)} \sum_{i=1}^{d+1} \sum_{j=1}^{d+1} (S_i - S_j)^2$$
and then the right side of (4.46) equals
$$\frac{(d+1)d}{2(d+1)}\, E(S_1 - S_2)^2 = \tfrac{d}{2}\, E(S_1 - S_2)^2,$$
as required.
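Both (4.45) and the equality of the two right sides for symmetric $f$ can be verified exactly by enumeration. The sketch below uses i.i.d. uniform bits and the symmetric function $f(x) = (\sum_i x_i)^2$ (a toy choice of mine, not from the text).

```python
from itertools import product

# Exact check of (4.45), and of the equality of the right sides of
# (4.45) and (4.46), for xi_i i.i.d. uniform on {0,1} and d = 3.
d = 3
vals = [0, 1]
p = 1.0 / len(vals)

def f(x):
    # A symmetric test function (hypothetical choice).
    return sum(x) ** 2

cube = list(product(vals, repeat=d))
Ef = sum(f(x) for x in cube) * p**d
var = sum((f(x) - Ef) ** 2 for x in cube) * p**d

# Right side of (4.45): (1/2) sum_i E (f(xi^(i)) - f(xi))^2, where
# xi^(i) is xi with coordinate i resampled independently.
rhs45 = 0.0
for x in cube:
    for i in range(d):
        for v in vals:
            y = list(x)
            y[i] = v
            rhs45 += 0.5 * p**(d + 1) * (f(tuple(y)) - f(x)) ** 2

# Right side of (4.46): E sum_{i=1}^{d+1} (S_i - Sbar)^2 with
# S_i = f applied to the (d+1)-sample with observation i deleted.
rhs46 = 0.0
for x in product(vals, repeat=d + 1):
    S = [f(x[:i] + x[i + 1:]) for i in range(d + 1)]
    Sbar = sum(S) / (d + 1)
    rhs46 += p**(d + 1) * sum((s - Sbar) ** 2 for s in S)

assert var <= rhs45 + 1e-12          # (4.45)
assert abs(rhs45 - rhs46) < 1e-12    # right sides agree for symmetric f
```

For this particular $f$ the inequality is strict (variance $7.5$ against a bound of $8.25$), so the bound is not an artifact of equality cases.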
The choice of parameters studied in this chapter is partly arbitrary, but our choice has been guided by two criteria, one philosophical and one technical. The philosophical criterion is
when formalizing a vague idea, choose a definition which has several equivalent formulations.
This is why we used the mean commute time parameter $\tau^* = \max_{ij} \big(E_i T_j + E_j T_i\big)$ instead of the one-sided maximal mean hitting time $\max_{ij} E_i T_j$, because the former permits the equivalent “resistance” interpretation.
Here is the technical criterion. Given a continuous-time chain and a state $b$, create a new chain by splitting $b$ into two states $b_1, b_2$ and setting
$$q_{b_1 b_2} = q_{b_2 b_1} = \rho; \qquad q_{b_u, j} = q_{b, j}, \quad q_{j, b_u} = \tfrac12\, q_{j, b} \qquad (u = 1, 2;\ j \neq b),$$
with $q$ unchanged elsewhere. Then $\pi^{\mathrm{new}}_{b_u} = \pi_b/2$, $u = 1,2$, with $\pi^{\mathrm{new}} = \pi$ elsewhere. As $\rho \to \infty$, we may regard the new chain as converging to the old chain in a certain sense. So our technical criterion for a parameter $\tau$ is that the value of $\tau$ for the new chain should converge, as $\rho \to \infty$, to the value for the old chain. It is easy to check this holds for the $\tau$'s we have studied, but it does not hold for, say, the maximal mean return time $\max_i E_i T^+_i = \max_i (1-\pi_i)/(\pi_i q_i)$, which at first sight might seem a natural parameter.
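Here is a numerical sketch of a state-splitting construction of the kind described above (the specific chain, the split state, and the failing parameter are my own illustrative choices): a state $b$ is split into two halves exchanging at rate $\rho$, each inheriting $b$'s outgoing rates and half of its incoming rates. The relaxation time of the split chain agrees with the original for large $\rho$, while the maximal mean return time does not converge to its original value.

```python
import numpy as np

def gap(Q, pi):
    # Spectral gap (smallest nonzero eigenvalue of -Q), via symmetrization.
    D = np.sqrt(pi)
    S = (D[:, None] * (-Q)) / D[None, :]
    return np.sort(np.linalg.eigvalsh((S + S.T) / 2))[1]

# Continuous-time random walk on a weighted triangle: q_ij = w_ij / w_i,
# so pi_i ~ w_i and every state has total jump rate 1.
w = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 10.0],
              [1.0, 10.0, 0.0]])
wi = w.sum(axis=1)
pi = wi / wi.sum()
Q = w / wi[:, None] - np.eye(3)

def split(Q, pi, b, rho):
    # Split state b into two halves (indices b and n): each half keeps b's
    # outgoing rates, receives half of b's incoming rates, and the halves
    # exchange at rate rho.
    n = Q.shape[0]
    Qn = np.zeros((n + 1, n + 1))
    for i in range(n):
        for j in range(n):
            if i != j and i != b and j != b:
                Qn[i, j] = Q[i, j]
    for j in range(n):
        if j != b:
            Qn[b, j] = Qn[n, j] = Q[b, j]
            Qn[j, b] = Qn[j, n] = Q[j, b] / 2
    Qn[b, n] = Qn[n, b] = rho
    np.fill_diagonal(Qn, 0.0)
    np.fill_diagonal(Qn, -Qn.sum(axis=1))
    pin = np.append(pi, pi[b] / 2)
    pin[b] /= 2
    return Qn, pin

rho = 1e6
Qn, pin = split(Q, pi, 0, rho)

# Relaxation time tau2 = 1/gap passes the criterion: it matches the
# original value for large rho.
assert np.isclose(1 / gap(Qn, pin), 1 / gap(Q, pi), rtol=1e-6)

# Maximal mean return time max_i (1 - pi_i)/(pi_i q_i) fails it: the split
# halves' return times collapse, so the limiting value differs.
ret_old = ((1 - pi) / (pi * (-np.diag(Q)))).max()
ret_new = ((1 - pin) / (pin * (-np.diag(Qn)))).max()
assert ret_new < ret_old / 2
```

In this example the original maximum is attained at the split state, which is exactly what makes the limiting value drop discontinuously.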