8.2.1 norms and operator norms
Our discussion here of norms will parallel and extend the discussion in
Chapter 2 Section yyy:6.2 of and norms. Given , both the norm of a function and the norm of a signed
measure are defined with respect to some fixed reference probability
distribution on , which for our purposes will be the stationary
distribution of some irreducible but not necessarily reversible chain under
consideration. For , the norm of a function
is
|
|
|
and we define the norm of a signed measure on to be the
norm of its density function with respect to :
|
|
|
For , the corresponding definitions are
|
|
|
and
|
|
|
Any matrix operates on functions by
left-multiplication:
|
|
|
(8.27) |
and on signed measures by right-multiplication:
|
|
|
(8.28) |
For (8.27), fix and and
regard as a linear operator mapping into . The
operator norm is defined by
|
|
|
(8.29) |
The sup in (8.29) is always achieved, and there are many
equivalent reexpressions, including
|
|
|
|
|
|
|
|
|
|
Note also that
|
|
|
(8.30) |
For (8.28), we may similarly regard as a linear operator mapping
signed measures , measured by , to signed measures , measured by . The corresponding definition
of operator norm, call it , is then
|
|
|
A brief calculation shows that
|
|
|
where is the matrix with entry , that
is, is the adjoint operator to (with respect to ).
Our applications in this chapter will all have , so we will not
need to distinguish between the two operator norms. In fact, all our
applications will take to be either or for some , where
|
|
|
xxx notation found elsewhere in book?
and is the transition matrix for the trivial
discrete time chain that jumps in one step to stationarity:
|
|
|
and where we assume that the chain for is reversible. Note
that operates on functions essentially as expectation with respect
to :
|
|
|
The effect of on signed measures is to map to ,
and
|
|
|
(8.31) |
8.2.2 A more general bound on distance
The following preliminary result, a close relative to Chapter 3, Lemmas yyy:21
and 23, is used frequently enough in the sequel that we isolate it for
reference. It is the simple identity in part (b) that shows why -based
techniques are so useful.
Lemma 8.10
(a) For any function ,
|
|
|
(b)
|
|
|
Proof.
(a) Using the backward equations
|
|
|
we find
|
|
|
and so
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
(b) From (a), for any we have
|
|
|
which yields
|
|
|
|
|
|
|
|
|
|
Thus . Taking to be the
eigenvector
|
|
|
of corresponding to eigenvalue demonstrates
equality and completes the proof of (b).
The key to all further developments in this chapter is the following result.
Lemma 8.11
For an irreducible reversible chain with arbitrary initial distribution and
any ,
|
|
|
Proof.
The equality is Lemma 8.10(b), and
|
|
|
proves the inequality.
We have already discussed, in Section 8.1, a technique for
bounding when (as is usually the case) it cannot be computed
exactly. To utilize Lemma 8.11, we must also bound . Since
|
|
|
(8.32) |
xxx For NOTES: By Jensen’s inequality (for ), any
transition matrix contracts for any .
and each is contractive on , i.e., (this follows, for example, from Lemma 8.10(a); and note that by considering constant functions), it follows that
|
|
|
(8.33) |
and the decrease is strictly monotone unless .
From (8.32) follows
|
|
|
(8.34) |
and again
|
|
|
(8.35) |
The norm decreases in (for fixed ) and is
identically when , but in applications we will want to take
. The following duality lemma will then often prove useful. Recall that
are said to be (Hölder-)conjugate
exponents if
|
|
|
(8.36) |
Lemma 8.12
For any operator , let
|
|
|
denote its adjoint with respect to . Then, for any ,
|
|
|
In particular, for a reversible chain and any and ,
|
|
|
(8.37) |
Proof.
Classical duality for spaces (see, e.g., Chapter 6 in [303])
asserts that, given and on ,
|
|
|
where
|
|
|
Thus
|
|
|
|
|
|
|
|
|
|
and also
|
|
|
so
|
|
|
Since this is true for every , we conclude . Reverse roles to complete the proof.
As a corollary, if then (8.34) and (8.37) combine to
give
|
|
|
and then
|
|
|
from Lemma 8.11. Thus
|
|
|
(8.38) |
Here is a somewhat different derivation of (8.38):
Lemma 8.13
For ,
|
|
|
|
|
|
|
|
|
|
Proof.
In light of (8.31), (8.30), and Lemma 8.10(b),
we need only establish the first equality. Indeed, is the norm of the function and so equals
|
|
|
Taking the maximum over we obtain
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Choosing in Lemma 8.11 recaptures (8.26), and choosing in Lemma 8.13 likewise recaptures the
consequence (8.5)
of (8.26). The central theme for both Nash and log-Sobolev techniques
is that one can improve upon these results by more judicious choice of .
8.2.3 Exact computation of
The proof of Lemma 8.13 can also be used to show that
|
|
|
(8.39) |
as at (8.9). In those rare instances when the spectral
representation is
known explicitly, this gives the formula
xxx Also useful in conjunction with comparison method—see Section 3.
xxx If we can compute this, we can compute . But the
point is to test out Lemma 8.13.
|
|
|
(8.40) |
and the techniques of later sections are not needed to compute .
In particular, in the vertex-transitive case
|
|
|
The norm clearly behaves nicely under the formation of products:
|
|
|
(8.41) |
Example 8.14
The two-state chain and the -cube.
For the two-state chain, the results of Chapter 5 Example yyy:4 show
|
|
|
In particular, for the continuized walk on the -path,
|
|
|
By the extension of (8.41) to higher-order products, we therefore have
|
|
|
for the continuized walk on the -cube. This result is also easily derived
from the results of Chapter 5 Example yyy:15. For and , the optimal choice of in Lemma 8.13 is
therefore
|
|
|
and this leads in straightforward fashion to the bound
|
|
|
While this is a significant improvement on the bound [cf. (8.12)]
|
|
|
obtained by setting , i.e., obtained using only information
about , it is not
xxx REWRITE!, in light of corrections to my notes.
as sharp as the upper bound
|
|
|
in (8.13) that will be derived using log-Sobolev techniques.
For this graph, the results of Chapter 5 Example yyy:9 show
|
|
|
It turns out for this example that is the optimal choice in
Lemma 8.13. This is not surprising given the sharpness of the bound
in (8.7) in this case. See Example 8.32 below for
further details.
Example 8.16
Product random walk on a -dimensional grid.
Consider again the benchmark product chain (i.e., the “tilde chain”) in
Example 8.5. That chain has relaxation time
|
|
|
so choosing in Lemma 8.13 gives
|
|
|
|
|
(8.42) |
|
|
|
|
|
This bound can be improved using . Indeed, if we first consider
continuized random walk on the -path with self-loops added at each end, the
stationary distribution is uniform, the eigenvalues are
|
|
|
and the eigenvectors are given by
|
|
|
According to (8.40) and simple estimates, for
|
|
|
and
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
when is standard normal; in particular, we have used the well-known (xxx:
point to Ross book exercise) bound
|
|
|
Thus
|
|
|
Return now to the “tilde chain” of Example 8.5, and assume for
simplicity that . Since this chain is a slowed-down
-fold product of the path chain, it has
|
|
|
(8.43) |
In particular, since , it is now easy to see that
|
|
|
(8.44) |
for a universal constant .
xxx We’ve improved on (8.42), which gave order .
xxx When used to bound at (8.4), the bound (8.44) is
“right”: see Theorem 4.1, p. 481, in [118].
The optimal choice of in Lemma 8.13 cannot be obtained explicitly
when (8.43) is used to bound . For
this reason, and for later purposes, it is useful to use the simpler, but more
restricted, bound
|
|
|
(8.45) |
To verify this bound, simply notice that for . When ,
(8.45), and are used in Lemma 8.13, we
find
|
|
|
xxx Improvement over (8.42) by factor , but display
following (8.43) shows still off by factor .