8.4.1 The log-Sobolev time
Given a probability distribution on a finite set , define
xxx For NOTES: Persi’s is double ours.
|
|
|
(8.52) |
for , recalling and using
the convention . By Jensen’s inequality,
|
, with equality if and only if is constant. |
|
Given a finite, irreducible, reversible Markov chain with stationary
distribution , define the logarithmic Sobolev (or log-Sobolev) time by
xxx For NOTES: Persi’s is .
xxx Note . (Show?) See also Corollary 8.27.
|
|
|
(8.53) |
Notice the similarity between (8.53) and the extremal
characterization of (Chapter 3, Theorem yyy:22):
|
|
|
We discuss exact computation of in Section 8.4.3, the
behavior of for product chains in Section 8.4.4, and a
comparison method for bounding in Section 8.4.5. In
Section 8.4.2 we focus on the connection between and
mixing
times. A first such result asserts that the relaxation time does not
exceed the
log-Sobolev time:
Lemma 8.22
.
xxx Remarks about how “usually” equality?
xxx For NOTES: Proof from [302], via [117].
Proof.
Given and , let . Then, writing , and with all
asymptotics as ,
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Also,
|
|
|
thus
|
|
|
and so
|
|
|
Furthermore, ; therefore
|
|
|
Finish by letting and then taking the supremum over .
8.4.2 , mixing times, and hypercontractivity
In this subsection we discuss the connection between the threshold time
parameter
|
|
|
(8.54) |
and the log-Sobolev time . As in Section 8.3, we again
consider the fundamental quantity
|
|
|
arising in the bound on in Lemma 8.13, and recall
from Section 8.3.1 that
|
decreases strictly monotonically from at
to as . |
|
The function is continuous. It would be nice (especially for use in
conjunction with the comparison technique) if we could characterize, in
terms of
the Dirichlet form , the value of , call it , such that
equals (say), but such a characterization is not presently available.
xxx For NOTES?: A partial result is Theorem 3.9 in [117], taking .
Open Problem 8.23
Characterize in terms of .
To carry on along these general lines, it turns out to be somewhat more
convenient to substitute use of
|
|
|
(8.55) |
an immediate consequence of Lemmas 8.11 and 8.12
and (8.34), for use of Lemma 8.13. The reason is that,
like , decreases monotonically to as ; but, unlike , it turns out that
|
for each , equals
for all sufficiently large . |
|
(8.56) |
The property (8.56) is called hypercontractivity, in light of
the facts that, for fixed , is a contraction on and is increasing in . Let
|
|
|
then , and we will see presently that for . The following theorem affords a connections with the log-Sobolev
time (and hence with the Dirichlet form ).
Theorem 8.24
For any finite, irreducible, reversible chain,
|
|
|
Proof.
The theorem is equivalently rephrased as follows:
|
for all and
satisfying |
|
(8.57) |
if and only if . The proof will make use of the generalization
|
|
|
of (8.52). Fixing and , we will also
employ the notation
|
|
|
|
|
(8.58) |
|
|
|
|
|
for .
As a preliminary, we compute the derivative of . To begin, we can proceed
as at the start of the proof of Lemma 8.10(a) to derive
|
|
|
Then
|
|
|
|
|
(8.59) |
|
|
|
|
|
For the first half of the proof we suppose that (8.57) holds and
must prove , that is, we must establish the log-Sobolev
inequality
|
for every . |
|
(8.60) |
To establish (8.60) it is enough to consider ,
xxx Do we actually use here?
since for arbitrary we have
|
and . |
|
(8.61) |
Plugging the specific formula (8.58) for into (8.59)
and setting gives
|
|
|
(8.62) |
Moreover, since
|
|
|
|
|
|
|
|
|
|
the (right-hand) derivative of at must be nonpositive. The
inequality (8.60) now follows from (8.62).
For the second half of the proof, we may assume and must
establish (8.57). For , (8.53) and
Lemma 8.25 (to follow) give
|
|
|
(8.63) |
for any . With , we have
, and replacing by in (8.63)
we obtain
|
|
|
From (8.59) we then find for all . Since , this implies
|
|
|
(8.64) |
We have assumed , but (8.64) now extends trivially to
general , and therefore
|
|
|
This gives the desired hypercontractivity assertion (8.57).
Here is the technical Dirichlet form lemma that was used in the proof of
Theorem 8.24.
Lemma 8.25
for and .
xxx Do we somewhere have the following?:
|
|
|
(8.65) |
Proof.
For any
|
|
|
|
|
|
|
|
|
|
This shows that
|
|
|
and the lemma follows easily from this and (8.65).
Now we are prepared to bound in terms of .
Theorem 8.26
(a) If , then for any state with ,
|
for . |
|
(b)
|
|
|
Proof.
Part (b) follows immediately from (8.54), part (a), and
Lemma 8.22. To prove part (a), we begin with (8.55):
|
|
|
As in the second half of the proof of Theorem 8.24, let . Then . Thus
|
|
|
Choosing we have and thus
|
for . |
|
We have established the upper bound in the following corollary; for the lower
bound, see Corollary 3.11 in [117].
Corollary 8.27
|
|
|
Examples illustrating the improvement Corollary 8.27 affords over the
similar result (8.7) in terms of are offered in
Examples 8.37 and 8.40.
8.4.3 Exact computation of
Exact computation of is exceptionally difficult—so difficult, in
fact, that is known only for a handful of examples. We present
some of
these examples in this subsection.
Example 8.28
Trivial two-state chains.
We consider a discrete-time chain on that jumps in one step to
stationarity (or, since the value of is unaffected by continuization,
the corresponding continuized chain). Thus with . We
also assume . The claim is that
|
|
|
(8.66) |
Note that this is continuous and decreasing for .
To prove (8.66), we need to show that for every nonconstant on , where
denotes the righthand side of (8.66), with equality for some .
First suppose . For the inequality, as at
(8.60)–(8.61) we may suppose and, by homogeneity,
|
|
|
We will work in terms of the single variable
|
|
|
so that
|
|
|
and we must consider . We
calculate
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
¿From here, a straightforward but very tedious calculus exercise shows that
decreases over , with , and that is strictly unimodal over , with
and . It follows that
is maximized over by
taking
to be the unique root to
|
|
|
|
|
(8.67) |
|
|
|
|
|
over .
There is on hope for solving (8.67) explicitly unless
|
|
|
i.e., . Fortunately, this is a solution to (8.67), and it falls in . The
corresponding value of is , so (8.66) follows, and we learn furthermore
that the function maximizing is , with and .
For , the major change is that now is increasing,
rather than unimodal, over . Thus , and (8.66) again follows.
Now consider any irreducible chain (automatically reversible) on
, with stationary distribution . Without loss of generality we
may suppose . We claim that
|
|
|
The proof is easy. The functional depends only on and so is
unchanged from Example 8.28, and the Dirichlet form changes from
in Example 8.28 to here.
Remark.
Recall from Chapter 5, Example yyy:4 that . It follows that
|
|
|
is a continuous and decreasing function of . In particular, we have
equality in Lemma 8.22 for a two-state chain if and only if . Moreover,
|
|
|
The proof of Lemma 8.22 and the result of
Example 8.28 can be combined to prove the following result: For
the “trivial” chain with , the log-Sobolev
time is given (when ) by
|
|
|
We omit the details, referring the reader to Theorem 5.1 of [117].
As an immediate corollary, we get a reverse-inequality complement to
Lemma 8.22:
Corollary 8.31
For any reversible chain (with , which is automatic for ),
|
|
|
Proof.
The result of Example 8.30 can be written
|
|
|
and by the extremal
characterization
of .
It follows readily from Example 8.30 that the continuized walk of
the complete graph has
|
|
|
Since , equality holds in Corollary 8.31 for
this example.
xxx Move the following warning to follow Corollary 8.27, perhaps?
Warning. Although the ratio of the upper bound on to lower bound in
Corollary 8.27 is smaller than that in (8.7), the
upper bound in Corollary 8.27 is sometimes of larger order of
magnitude
than the upper bound in (8.7). For the complete graph,
(8.7) says
|
|
|
and Corollary 8.27 yields
|
|
|
while, from Chapter 5, yyy:(33) it follows that
|
|
|
As another example, the product chain development in the next subsection
together with Example 8.29 will give exactly for the
-cube. On the other hand, the exact value of is unknown even for
many of the simplest examples in Chapter 5. For instance,
Open Problem 8.33
Calculate for the -cycle (Chapter 5 Example yyy:7) when .
xxx For NOTES: is complete graph , covered by
Example 8.32. ( for .)
Notwithstanding Open Problem 8.33, the value of is known
up to multiplicative constants. Indeed, it is shown in Section 4.2
in [117] that
|
|
|
Here is a similar result we will find useful later in dealing with our running
example of the grid.
Example 8.34
The -path with end self-loops.
For this example, discussed above in Example 8.16, we claim
|
|
|
The lower bound is easy, using Lemma 8.22:
|
|
|
For the upper bound we use Corollary 8.27 and estimation
of . Indeed, in Example 8.16 it was shown that
|
|
|
Substituting gives , so .
xxx P.S. Persi (98/07/02) points out that H. T. Yau showed for random transpositions by combining
(Lemma 8.22) and with . I have written notes generalizing and discussing this
and will incorporate them into a later version.
8.4.4 and product chains
xxx Remind reader of definition of product chain in continuous time given in
Chapter 4 Section yyy:6.2.
xxx Motivate study as providing benchmark chains for comparison method.
xxx Recall from Chapter 4, yyy:(42):
|
|
|
(8.68) |
xxx Product chain has transition rates equal (off diagonal) to
|
|
|
(8.69) |
xxx Dirichlet form works out very nicely for products:
Lemma 8.35
|
|
|
Proof.
This follows easily from (8.69) and the definition of in
Chapter 3 Section yyy:6.1 (cf. (68)).
The analogue of (8.68) for the log-Sobolev time is also true:
xxx For NOTES?: Can give analagous proof of (8.68): see my notes,
page 8.4.24A.
Theorem 8.36
For a continuous-time product chain,
|
|
|
Proof.
The keys to the proof are Lemma 8.35 and the following “law of
total -functional.” Given a function on the product
state space , define a function on
by
|
|
|
Then
|
|
|
|
|
|
|
|
|
|
where we have used
|
|
|
Thus, using the extremal characterization (definition) (8.53)
of and ,
|
|
|
(8.70) |
But from
|
|
|
follows
|
|
|
(8.71) |
From (8.70), (8.71), Lemma 8.35, and the extremal
characterization of we conclude . Testing on functions that depend only on one of the two
variables shows that .
Theorem 8.36 extends in the obvious fashion to
higher-dimensional
products.
The continuized walk on the -cube (Chapter 5, Example yyy:15) is
simply the product of copies of the continuized walk on the -path, each
run at rate . Therefore, since the log-Sobolev time for the -path
equals by Example 8.29, the corresponding time for the
-cube is
|
|
|
¿From this and the upper bound in Corollary 8.27 we can deduce
|
|
|
As discussed in this chapter’s introduction, this bound is remarkably sharp
and improves significantly upon the analogous bound that uses only knowledge
of . xxx Recall corrections marked on pages 8.2.11–12 of my
notes.
8.4.5 The comparison method for bounding
In Section 8.1 we compared relaxation times for two chains by
using the extremal characterization and comparing Dirichlet forms and
variances. For comparing variances, we used the characterization
|
|
|
To extend the comparison method to log-Sobolev times, we need the following
similar characterization of .
xxx For NOTES: Cite [181].
Lemma 8.38
The functional in (8.52) satisfies
|
|
|
(8.72) |
with
|
|
|
(8.73) |
Proof.
We compute
|
|
|
|
|
|
|
|
|
|
Thus is strictly convex and minimized by the choice ,
and so
|
|
|
This proves (8.72). Finally, applying the inequality
|
|
|
to and gives the inequality in (8.73).
Now it’s easy to see how to compare log-Sobolev times, since, adopting the
notation of Section 8.1, Lemma 8.38 immediately yields the
analogue
|
|
|
of (8.18). In the notation of Corollary 8.2, we
therefore have
Corollary 8.39 (comparison of log-Sobolev times)
|
|
|
Example 8.40
Random walk on a -dimensional grid.
xxx Remarked in Example 8.34 that for -path
with end self-loops.
xxx So by Theorem 8.36, benchmark product chain has .
Recalling and from Example 8.5, we therefore
find
|
|
|
(8.74) |
for random walk on the grid. Then Theorem 8.26(b) gives
|
|
|
which is of order . This is an improvement
on the
-only bound of (8.50) and may be compared
with the Nash-based bound of (8.51). In
Example 8.43 we will combine Nash-inequality and log-Sobolev
techniques
to get a bound of order