8 Advanced L2L^{2} Techniques for Bounding Mixing Times (May 19 1999)

8.2 Improved bounds on L2L^{2} distance

The central theme of the remainder of this chapter is that norms other than the L1L^{1} norm (and closely related variation distance) and L2L^{2} norm can be used to improve substantially upon the bound

Pi(Xt)-π()2π-1/2e-t/τ2.\|P_{i}(X_{t}\in\cdot)-\pi(\cdot)\|_{2}\leq\pi^{-1/2}e^{-t/\tau_{2}}. (8.26)

8.2.1 LqL^{q} norms and operator norms

Our discussion here of LqL^{q} norms will parallel and extend the discussion in Chapter 2 Section yyy:6.2 of L1L^{1} and L2L^{2} norms. Given 1q1\leq q\leq\infty, both the LqL^{q} norm of a function and the LqL^{q} norm of a signed measure are defined with respect to some fixed reference probability distribution π\pi on II, which for our purposes will be the stationary distribution of some irreducible but not necessarily reversible chain under consideration. For 1q<1\leq q<\infty, the LqL^{q} norm of a function f:I𝐑f:I\to{\bf R} is

fq:=(iπi|f(i)|q)1/q,\|f\|_{q}:=\left(\sum_{i}\pi_{i}|f(i)|^{q}\right)^{1/q},

and we define the LqL^{q} norm of a signed measure ν\nu on II to be the LqL^{q} norm of its density function with respect to π\pi:

νq:=(jπj1-q|νj|q)1/q.\|\nu\|_{q}:=\left(\sum_{j}\pi^{1-q}_{j}\,|\nu_{j}|^{q}\right)^{1/q}.

For q=q=\infty, the corresponding definitions are

f:=maxi|f(i)|\|f\|_{\infty}:=\max_{i}|f(i)|

and

ν:=maxj(|νj|/πj).\|\nu\|_{\infty}:=\max_{j}(|\nu_{j}|/\pi_{j}).

Any matrix 𝐀:=(aij:i,jI){\bf A}:=(a_{ij}:i,j\in I) operates on functions f:I𝐑f:I\to{\bf R} by left-multiplication:

(𝐀f)(i)=jaijf(j),({\bf A}f)(i)=\sum_{j}a_{ij}f(j), (8.27)

and on signed measures ν\nu by right-multiplication:

(ν𝐀)j=iνiaij.(\nu{\bf A})_{j}=\sum_{i}\nu_{i}a_{ij}. (8.28)

For (8.27), fix 1q11\leq q_{1}\leq\infty and 1q21\leq q_{2}\leq\infty and regard 𝐀{\bf A} as a linear operator mapping Lq1L^{q_{1}} into Lq2L^{q_{2}}. The operator norm 𝐀q1q2\|{\bf A}\|_{q_{1}\to q_{2}} is defined by

𝐀q1q2:=sup{𝐀fq2:fq1=1}.\|{\bf A}\|_{q_{1}\to q_{2}}:=\sup\{\|{\bf A}f\|_{q_{2}}:\|f\|_{q_{1}}=1\}. (8.29)

The sup in (8.29) is always achieved, and there are many equivalent reexpressions, including

𝐀q1q2\displaystyle\|{\bf A}\|_{q_{1}\to q_{2}} =\displaystyle= max{𝐀fq2:fq11}\displaystyle\max\{\|{\bf A}f\|_{q_{2}}:\|f\|_{q_{1}}\leq 1\}
=\displaystyle= max{𝐀fq2/fq1:f0}.\displaystyle\max\{\|{\bf A}f\|_{q_{2}}/\|f\|_{q_{1}}:f\neq 0\}.

Note also that

𝐁𝐀q1q3𝐀q1q2𝐁q2q3,   1q1,q2,q3.\|{\bf B}{\bf A}\|_{q_{1}\to q_{3}}\leq\|{\bf A}\|_{q_{1}\to q_{2}}\|{\bf B}\|% _{q_{2}\to q_{3}},\ \ \ 1\leq q_{1},q_{2},q_{3}\leq\infty. (8.30)

For (8.28), we may similarly regard 𝐀{\bf A} as a linear operator mapping signed measures ν\nu, measured by νq1\|\nu\|_{q_{1}}, to signed measures ν𝐀\nu{\bf A}, measured by ν𝐀q2\|\nu{\bf A}\|_{q_{2}}. The corresponding definition of operator norm, call it ∥|𝐀∥|q1q2\||{\bf A}\||_{q_{1}\to q_{2}}, is then

∥|𝐀∥|q1q2:=sup{ν𝐀q2:νq1=1}.\||{\bf A}\||_{q_{1}\to q_{2}}:=\sup\{\|\nu{\bf A}\|_{q_{2}}:\|\nu\|_{q_{1}}=1\}.

A brief calculation shows that

∥|𝐀∥|q1q2=𝐀*q1q2,\||{\bf A}\||_{q_{1}\to q_{2}}=\|{\bf A}^{*}\|_{q_{1}\to q_{2}},

where 𝐀*{\bf A}^{*} is the matrix with (i,j)(i,j) entry πjaji/πi\pi_{j}a_{ji}/\pi_{i}, that is, 𝐀*{\bf A}^{*} is the adjoint operator to 𝐀{\bf A} (with respect to π\pi).

Our applications in this chapter will all have 𝐀=𝐀*{\bf A}={\bf A}^{*}, so we will not need to distinguish between the two operator norms. In fact, all our applications will take 𝐀{\bf A} to be either 𝐏t{\bf P}_{t} or 𝐏t-𝐄{\bf P}_{t}-{\bf E} for some t0t\geq 0, where

𝐏t:=(pij(t):i,jI){\bf P}_{t}:=(p_{ij}(t):i,j\in I)

xxx notation 𝐏t{\bf P}_{t} found elsewhere in book?

and 𝐄=limt𝐏t{\bf E}=\lim_{t\to\infty}{\bf P}_{t} is the transition matrix for the trivial discrete time chain that jumps in one step to stationarity:

𝐄=(πj:i,jI),{\bf E}=(\pi_{j}:i,j\in I),

and where we assume that the chain for (𝐏t)({\bf P}_{t}) is reversible. Note that 𝐄{\bf E} operates on functions essentially as expectation with respect to π\pi:

(𝐄f)(i)=jπjf(j),   iI.({\bf E}f)(i)=\sum_{j}\pi_{j}f(j),\ \ \ i\in I.

The effect of 𝐄{\bf E} on signed measures is to map ν\nu to (iνi)π(\sum_{i}\nu_{i})\pi, and

𝐏t𝐄=𝐄=𝐄𝐏t,   t0.{\bf P}_{t}{\bf E}={\bf E}={\bf E}{\bf P}_{t},\ \ \ t\geq 0. (8.31)

8.2.2 A more general bound on L2L^{2} distance

The following preliminary result, a close relative to Chapter 3, Lemmas yyy:21 and 23, is used frequently enough in the sequel that we isolate it for reference. It is the simple identity in part (b) that shows why L2L^{2}-based techniques are so useful.

Lemma 8.10

(a) For any function ff,

ddt𝐏tf22=-2(𝐏tf,𝐏tf)-2τ2varπ𝐏tf0.\frac{d}{dt}\|{\bf P}_{t}f\|^{2}_{2}=-2\mbox{${\cal E}$}({\bf P}_{t}f,{\bf P}_% {t}f)\leq-\frac{2}{\tau_{2}}{{\rm var}}_{\pi}\,{\bf P}_{t}f\leq 0.

(b)

𝐏t-𝐄22=e-t/τ2,   t0.\|{\bf P}_{t}-{\bf E}\|_{2\to 2}=e^{-t/\tau_{2}},\ \ \ t\geq 0.

Proof. (a) Using the backward equations

ddtpij(t)=kqikpkj(t)\frac{d}{dt}p_{ij}(t)=\sum_{k}q_{ik}\,p_{kj}(t)

we find

ddt(𝐏tf)(i)=kqik[(𝐏tf)(k)]\frac{d}{dt}({\bf P}_{t}f)(i)=\sum_{k}q_{ik}[({\bf P}_{t}f)(k)]

and so

ddt𝐏tf22\displaystyle\frac{d}{dt}\|{\bf P}_{t}f\|^{2}_{2} =\displaystyle= 2ikπi[(𝐏tf)(i)]qik[(𝐏tf)(k)]\displaystyle 2\sum_{i}\sum_{k}\pi_{i}[({\bf P}_{t}f)(i)]q_{ik}[({\bf P}_{t}f)% (k)]
=\displaystyle= -2(𝐏tf,𝐏tf)   by Chapter 3 yyy:(70)\displaystyle-2\mbox{${\cal E}$}({\bf P}_{t}f,{\bf P}_{t}f)\mbox{\ \ \ by % Chapter~{}3 yyy:(70)}
\displaystyle\leq -2τ2varπ𝐏tf   by the extremal characterization of τ2\tau_{2}.\displaystyle-\frac{2}{\tau_{2}}{{\rm var}}_{\pi}\,{\bf P}_{t}f\mbox{\ \ \ by % the extremal characterization of~{}$\tau_{2}$.}

(b) From (a), for any ff we have

ddt(𝐏t-𝐄)f22=ddt𝐏t(f-𝐄f)22-2τ2(𝐏t-𝐄)f22,\frac{d}{dt}\|({\bf P}_{t}-{\bf E})f\|^{2}_{2}=\frac{d}{dt}\|{\bf P}_{t}(f-{% \bf E}f)\|^{2}_{2}\leq-\frac{2}{\tau_{2}}\|({\bf P}_{t}-{\bf E})f\|^{2}_{2},

which yields

(𝐏t-𝐄)f22\displaystyle\|({\bf P}_{t}-{\bf E})f\|^{2}_{2} \displaystyle\leq (𝐏0-𝐄)f22e-2t/τ2=(varπf)e-2t/τ2\displaystyle\|({\bf P}_{0}-{\bf E})f\|^{2}_{2}\,e^{-2t/\tau_{2}}=({{\rm var}}% _{\pi}\,f)e^{-2t/\tau_{2}}
\displaystyle\leq f22e-2t/τ2.\displaystyle\|f\|^{2}_{2}\,e^{-2t/\tau_{2}}.

Thus 𝐏t-𝐄22e-t/τ2\|{\bf P}_{t}-{\bf E}\|_{2\to 2}\leq e^{-t/\tau_{2}}. Taking ff to be the eigenvector

fi:=πi-1/2ui2,   iI,f_{i}:=\pi^{-1/2}_{i}u_{i2},\ \ \ i\in I,

of 𝐏t-𝐄{\bf P}_{t}-{\bf E} corresponding to eigenvalue exp(-t/τ2)\exp(-t/\tau_{2}) demonstrates equality and completes the proof of (b).    

The key to all further developments in this chapter is the following result.

Lemma 8.11

For an irreducible reversible chain with arbitrary initial distribution and any s,t0s,t\geq 0,

P(Xs+t)-π()2P(Xs)2𝐏t-𝐄22=P(Xs)2e-t/τ2.\|P(X_{s+t}\in\cdot)-\pi(\cdot)\|_{2}\leq\|P(X_{s}\in\cdot)\|_{2}\|{\bf P}_{t}% -{\bf E}\|_{2\to 2}=\|P(X_{s}\in\cdot)\|_{2}\,e^{-t/\tau_{2}}.

Proof. The equality is Lemma 8.10(b), and

P(Xs+t)-π()2=P(Xs)(𝐏t-𝐄)2P(Xs)2𝐏t-𝐄22\|P(X_{s+t}\in\cdot)-\pi(\cdot)\|_{2}=\|P(X_{s}\in\cdot)({\bf P}_{t}-{\bf E})% \|_{2}\leq\|P(X_{s}\in\cdot)\|_{2}\|{\bf P}_{t}-{\bf E}\|_{2\to 2}

proves the inequality.    

We have already discussed, in Section 8.1, a technique for bounding τ2\tau_{2} when (as is usually the case) it cannot be computed exactly. To utilize Lemma 8.11, we must also bound P(Xs)2\|P(X_{s}\in\cdot)\|_{2}. Since

P(Xs)2=P(X0)𝐏s2\|P(X_{s}\in\cdot)\|_{2}=\|P(X_{0}\in\cdot){\bf P}_{s}\|_{2} (8.32)

xxx For NOTES: By Jensen’s inequality (for 1q<1\leq q<\infty), any transition matrix contracts LqL^{q} for any 1q1\leq q\leq\infty.

and each 𝐏t{\bf P}_{t} is contractive on L2L^{2}, i.e., 𝐏t221\|{\bf P}_{t}\|_{2\to 2}\leq 1 (this follows, for example, from Lemma 8.10(a); and note that 𝐏t22=1\|{\bf P}_{t}\|_{2\to 2}=1 by considering constant functions), it follows that

P(Xs)2 decreases monotonically to 11 as ss\uparrow\infty,\|P(X_{s}\in\cdot)\|_{2}\mbox{\ decreases monotonically to~{}$1$ as~{}$s% \uparrow\infty$,} (8.33)

and the decrease is strictly monotone unless P(X0)=π()P(X_{0}\in\cdot)=\pi(\cdot). From (8.32) follows

P(Xs)2P(X0)q*𝐏sq*2 for any 1q*1\leq q^{*}\leq\infty,\|P(X_{s}\in\cdot)\|_{2}\leq\|P(X_{0}\in\cdot)\|_{q^{*}}\|{\bf P}_{s}\|_{q^{*}% \to 2}\mbox{\ for any $1\leq q^{*}\leq\infty$,} (8.34)

and again

𝐏sq*2 decreases monotonically to 11 as ss\uparrow\infty.\|{\bf P}_{s}\|_{q^{*}\to 2}\mbox{\ decreases monotonically to~{}$1$ as~{}$s% \uparrow\infty$.} (8.35)

The norm 𝐏sq*2\|{\bf P}_{s}\|_{q^{*}\to 2} decreases in q*q^{*} (for fixed ss) and is identically 11 when q*2q^{*}\geq 2, but in applications we will want to take q*<2q^{*}<2. The following duality lemma will then often prove useful. Recall that 1q,q*1\leq q,q^{*}\leq\infty are said to be (Hölder-)conjugate exponents if

1q+1q*=1.\frac{1}{q}+\frac{1}{q^{*}}=1. (8.36)
Lemma 8.12

For any operator 𝐀{\bf A}, let

𝐀*=(πjaji/πi:i,jI){\bf A}^{*}=(\pi_{j}a_{ji}/\pi_{i}:i,j\in I)

denote its adjoint with respect to π\pi. Then, for any 1q1,q21\leq q_{1},q_{2}\leq\infty,

𝐀q1q2=𝐀*q2*q1*.\|{\bf A}\|_{q_{1}\to q_{2}}=\|{\bf A}^{*}\|_{q^{*}_{2}\to q^{*}_{1}}.

In particular, for a reversible chain and any 1q1\leq q\leq\infty and s0s\geq 0,

𝐏s2q=𝐏sq*2.\|{\bf P}_{s}\|_{2\to q}=\|{\bf P}_{s}\|_{q^{*}\to 2}. (8.37)

Proof. Classical duality for LqL^{q} spaces (see, e.g., Chapter 6 in [303]) asserts that, given 1q1\leq q\leq\infty and gg on II,

gq*=max{|f,g|:fq=1}\|g\|_{q^{*}}=\max\{|\langle f,g\rangle|:\|f\|_{q}=1\}

where

f,g:=iπif(i)g(i).\langle f,g\rangle:=\sum_{i}\pi_{i}f(i)g(i).

Thus

𝐀*gq1*\displaystyle\|{\bf A}^{*}g\|_{q^{*}_{1}} =\displaystyle= max{|f,𝐀*g|:fq1=1}\displaystyle\max\{|\langle f,{\bf A}^{*}g\rangle|:\|f\|_{q_{1}}=1\}
=\displaystyle= max{|𝐀f,g|:fq=1},\displaystyle\max\{|\langle{\bf A}f,g\rangle|:\|f\|_{q}=1\},

and also

|𝐀f,g|𝐀fq2gq2*𝐀q1q2fq1gq2*,|\langle{\bf A}f,g\rangle|\leq\|{\bf A}f\|_{q_{2}}\|g\|_{q^{*}_{2}}\leq\|{\bf A% }\|_{q_{1}\to q_{2}}\|f\|_{q_{1}}\|g\|_{q^{*}_{2}},

so

𝐀*gq1*𝐀q1q2gq2*.\|{\bf A}^{*}g\|_{q^{*}_{1}}\leq\|{\bf A}\|_{q_{1}\to q_{2}}\|g\|_{q^{*}_{2}}.

Since this is true for every gg, we conclude 𝐀*q2*q1*𝐀q1q2\|{\bf A}^{*}\|_{q^{*}_{2}\to q^{*}_{1}}\leq\|{\bf A}\|_{q_{1}\to q_{2}}. Reverse roles to complete the proof.    

As a corollary, if q*=1q^{*}=1 then (8.34) and (8.37) combine to give

P(Xs)2𝐏s12=𝐏s2\|P(X_{s}\in\cdot)\|_{2}\leq\|{\bf P}_{s}\|_{1\to 2}=\|{\bf P}_{s}\|_{2\to\infty}

and then

P(Xs+t)-π()2𝐏s2e-t/τ2\|P(X_{s+t}\in\cdot)-\pi(\cdot)\|_{2}\leq\|{\bf P}_{s}\|_{2\to\infty}\,e^{-t/% \tau_{2}}

from Lemma 8.11. Thus

d^(2(s+t))𝐏s2e-t/τ2.\sqrt{{\hat{d}}(2(s+t))}\leq\|{\bf P}_{s}\|_{2\to\infty}\,e^{-t/\tau_{2}}. (8.38)

Here is a somewhat different derivation of (8.38):

Lemma 8.13

For 0st0\leq s\leq t,

d^(2t)=𝐏t-𝐄2\displaystyle\sqrt{{\hat{d}}(2t)}=\|{\bf P}_{t}-{\bf E}\|_{2\to\infty} \displaystyle\leq 𝐏s2𝐏t-s-𝐄22\displaystyle\|{\bf P}_{s}\|_{2\to\infty}\|{\bf P}_{t-s}-{\bf E}\|_{2\to 2}
=\displaystyle= 𝐏s2e-(t-s)/τ2.\displaystyle\|{\bf P}_{s}\|_{2\to\infty}\,e^{-(t-s)/\tau_{2}}.

Proof. In light of (8.31), (8.30), and Lemma 8.10(b), we need only establish the first equality. Indeed, Pi(Xt)-π()2\|P_{i}(X_{t}\in\cdot)-\pi(\cdot)\|_{2} is the L2L^{2} norm of the function (Pi(Xt)/π())-1(P_{i}(X_{t}\in\cdot)/\pi(\cdot))-1 and so equals

max{|j(pij(t)-πj)f(j)|:f2=1}=max{|((𝐏t-𝐄)f)(i)|:f2=1}.\max\left\{\left|\sum_{j}(p_{ij}(t)-\pi_{j})f(j)\right|:\|f\|_{2}=1\right\}=% \max\{|(({\bf P}_{t}-{\bf E})f)(i)|:\|f\|_{2}=1\}.

Taking the maximum over iIi\in I we obtain

d^(2t)\displaystyle\sqrt{{\hat{d}}(2t)} =\displaystyle= max{maxi|((𝐏t-𝐄)f)(i)|:f2=1}\displaystyle\max\left\{\max_{i}|(({\bf P}_{t}-{\bf E})f)(i)|:\|f\|_{2}=1\right\}
=\displaystyle= max{(𝐏t-𝐄)f:f2=1}\displaystyle\max\{\|({\bf P}_{t}-{\bf E})f\|_{\infty}:\|f\|_{2}=1\}
=\displaystyle= 𝐏t-𝐄2.   \displaystyle\|{\bf P}_{t}-{\bf E}\|_{2\to\infty}.~{}\ \ \rule{4.3pt}{4.3pt}

Choosing s=0s=0 in Lemma 8.11 recaptures (8.26), and choosing s=0s=0 in Lemma 8.13 likewise recaptures the consequence (8.5) of (8.26). The central theme for both Nash and log-Sobolev techniques is that one can improve upon these results by more judicious choice of ss.

8.2.3 Exact computation of N(s)N(s)

The proof of Lemma 8.13 can also be used to show that

N(s):=𝐏s2=maxiPi(Xs)2=maxipii(2s)πi,N(s):=\|{\bf P}_{s}\|_{2\to\infty}=\max_{i}\|P_{i}(X_{s}\in\cdot)\|_{2}=\max_{% i}\sqrt{\frac{p_{ii}(2s)}{\pi_{i}}}, (8.39)

as at (8.9). In those rare instances when the spectral representation is known explicitly, this gives the formula

xxx Also useful in conjunction with comparison method—see Section 3.

xxx If we can compute this, we can compute d^(2t)=N2(t)-1{\hat{d}}(2t)=N^{2}(t)-1. But the point is to test out Lemma 8.13.

N2(s)=1+maxiπi-1m=2nuim2exp(-2λms),N^{2}(s)=1+\max_{i}\pi_{i}^{-1}\sum_{m=2}^{n}u^{2}_{im}\exp(-2\lambda_{m}s), (8.40)

and the techniques of later sections are not needed to compute N(s)N(s). In particular, in the vertex-transitive case

N2(s)=1+m=2nexp(-2λs).N^{2}(s)=1+\sum_{m=2}^{n}\exp(-2\lambda s).

The norm N(s)N(s) clearly behaves nicely under the formation of products:

N(s)=N(1)(s)N(2)(s).N(s)=N^{(1)}(s)N^{(2)}(s). (8.41)
Example 8.14

The two-state chain and the dd-cube.

For the two-state chain, the results of Chapter 5 Example yyy:4 show

N2(s)=1+max(p,q)min(p,q)e-2(p+q)s.N^{2}(s)=1+\frac{\max(p,q)}{\min(p,q)}e^{-2(p+q)s}.

In particular, for the continuized walk on the 22-path,

N2(s)=1+e-4s.N^{2}(s)=1+e^{-4s}.

By the extension of (8.41) to higher-order products, we therefore have

N2(s)=(1+e-4s/d)dN^{2}(s)=(1+e^{-4s/d})^{d}

for the continuized walk on the dd-cube. This result is also easily derived from the results of Chapter 5 Example yyy:15. For d2d\geq 2 and t14dlog(d-1)t\geq\frac{1}{4}d\log(d-1), the optimal choice of ss in Lemma 8.13 is therefore

s=14dlog(d-1)s={\textstyle\frac{1}{4}}d\log(d-1)

and this leads in straightforward fashion to the bound

τ^14d(logd+3).{\hat{\tau}}\leq{\textstyle\frac{1}{4}}d(\log d+3).

While this is a significant improvement on the bound [cf. (8.12)]

τ^14(log2)d2+12d{\hat{\tau}}\leq{\textstyle\frac{1}{4}}(\log 2)d^{2}+{\textstyle\frac{1}{2}}d

obtained by setting s=0s=0, i.e., obtained using only information about τ2\tau_{2}, it is not

xxx REWRITE!, in light of corrections to my notes.

as sharp as the upper bound

τ^(1+o(1))14dlogd{\hat{\tau}}\leq(1+o(1)){\textstyle\frac{1}{4}}d\log d

in (8.13) that will be derived using log-Sobolev techniques.

Example 8.15

The complete graph.

For this graph, the results of Chapter 5 Example yyy:9 show

N2(s)=1+(n-1)exp(-2nsn-1).N^{2}(s)=1+(n-1)\exp\left(-\frac{2ns}{n-1}\right).

It turns out for this example that s=0s=0 is the optimal choice in Lemma 8.13. This is not surprising given the sharpness of the bound in (8.7) in this case. See Example 8.32 below for further details.

Example 8.16

Product random walk on a dd-dimensional grid.

Consider again the benchmark product chain (i.e., the “tilde chain”) in Example 8.5. That chain has relaxation time

τ2=d(1-cos(πm))-112dm2,\tau_{2}=d\left(1-\cos\left(\frac{\pi}{m}\right)\right)^{-1}\leq\frac{1}{2}dm^% {2},

so choosing s=0s=0 in Lemma 8.13 gives

τ^\displaystyle{\hat{\tau}} \displaystyle\leq d1-cos(π/m)(12logn+1)\displaystyle\frac{d}{1-\cos(\pi/m)}\left(\frac{1}{2}\log n+1\right) (8.42)
\displaystyle\leq 14dm2(logn+2).\displaystyle{\textstyle\frac{1}{4}}dm^{2}(\log n+2).

This bound can be improved using N()N(\cdot). Indeed, if we first consider continuized random walk on the mm-path with self-loops added at each end, the stationary distribution is uniform, the eigenvalues are

λl=1-cos(π(l-1)/m),   1lm,\lambda_{l}=1-\cos(\pi(l-1)/m),\ \ \ 1\leq l\leq m,

and the eigenvectors are given by

uil=(2/m)1/2cos(π(l-1)(i-12)/m),   0im-1,   2lm.u_{il}=(2/m)^{1/2}\cos(\pi(l-1)(i-{\textstyle\frac{1}{2}})/m),\ \ \ 0\leq i% \leq m-1,\ \ \ 2\leq l\leq m.

According to (8.40) and simple estimates, for s>0s>0

N2(s)-12l=2mexp[-2s(1-cos(π(l-1)/m)]2l=1m-1exp(-4sl2/m2)N^{2}(s)-1\leq 2\sum_{l=2}^{m}\exp[-2s(1-\cos(\pi(l-1)/m)]\leq 2\sum_{l=1}^{m-% 1}\exp(-4sl^{2}/m^{2})

and

l=2m-1exp(-4sl2/m2)\displaystyle\sum_{l=2}^{m-1}\exp(-4sl^{2}/m^{2}) \displaystyle\leq x=1exp(-4sx2/m2)dx\displaystyle\int_{x=1}^{\infty}\exp(-4sx^{2}/m^{2})\,dx
=\displaystyle= m(π4s)1/2P(Z2(2s)1/2m)\displaystyle m\left(\frac{\pi}{4s}\right)^{1/2}P\left(Z\geq\frac{2(2s)^{1/2}}% {m}\right)
\displaystyle\leq (π/44s/m2)1/2exp(-4s/m2)\displaystyle\left(\frac{\pi/4}{4s/m^{2}}\right)^{1/2}\exp(-4s/m^{2})
\displaystyle\leq (4s/m2)-1/2exp(-4s/m2)\displaystyle(4s/m^{2})^{-1/2}\exp(-4s/m^{2})

when ZZ is standard normal; in particular, we have used the well-known (xxx: point to Ross book exercise) bound

P(Zz)12e-z2/2,z0.P(Z\geq z)\leq{\textstyle\frac{1}{2}}e^{-z^{2}/2},\ \ \ z\geq 0.

Thus

N2(s)1+2[1+(4s/m2)-1/2]exp(-4s/m2),   s>0.N^{2}(s)\leq 1+2[1+(4s/m^{2})^{-1/2}]\exp(-4s/m^{2}),\ \ \ s>0.

Return now to the “tilde chain” of Example 8.5, and assume for simplicity that m1==md=mm_{1}=\cdots=m_{d}=m. Since this chain is a slowed-down dd-fold product of the path chain, it has

N2(s)[1+2(1+(4sdm2)-1/2)exp(-4sdm2)]d,   s>0.N^{2}(s)\leq\left[1+2\left(1+\left(\frac{4s}{dm^{2}}\right)^{-1/2}\right)\exp% \left(-\frac{4s}{dm^{2}}\right)\right]^{d},\ \ \ s>0. (8.43)

In particular, since d^(2t)=N2(t)-1{\hat{d}}(2t)=N^{2}(t)-1, it is now easy to see that

τ^Km2dlogd=Kn2/ddlogd{\hat{\tau}}\leq Km^{2}d\log d=Kn^{2/d}d\log d (8.44)

for a universal constant KK.

xxx We’ve improved on (8.42), which gave order m2d2logmm^{2}d^{2}\log m.

xxx When used to bound d(t)d(t) at (8.4), the bound (8.44) is “right”: see Theorem 4.1, p. 481, in [118].

The optimal choice of ss in Lemma 8.13 cannot be obtained explicitly when (8.43) is used to bound N(s)=𝐏s2N(s)=\|{\bf P}_{s}\|_{2\to\infty}. For this reason, and for later purposes, it is useful to use the simpler, but more restricted, bound

N2(s)(4dm2/s)d/2   for 0sdm2/160\leq s\leq dm^{2}/16.N^{2}(s)\leq(4dm^{2}/s)^{d/2}\mbox{\ \ \ for $0\leq s\leq dm^{2}/16$.} (8.45)

To verify this bound, simply notice that u+2ue-u2+2e-u24u+2ue^{-u^{2}}+2e^{-u^{2}}\leq 4 for 0u120\leq u\leq\frac{1}{2}. When s=dm2/16s=dm^{2}/16, (8.45), and τ2dm2/2\tau_{2}\leq dm^{2}/2 are used in Lemma 8.13, we find

τ^34m2d2(log2+34d-1).{\hat{\tau}}\leq{\textstyle\frac{3}{4}}m^{2}d^{2}(\log 2+{\textstyle\frac{3}{4% }}d^{-1}).

xxx Improvement over (8.42) by factor Θ(logm)\Theta(\log m), but display following (8.43) shows still off by factor Θ(d/logd)\Theta(d/\log d).