This is a chatty discussion of my older research, intended to be understandable to a Ph. D. student in theoretical or applied probability. Numbers like [22] refer to the bibliography and are accompanied by Math Reviews links. Instead of summarizing the results (for which see Math Reviews or the actual papers) I focus on context: where did the problem come from, and what subsequent work has been done?

Rare Events and the Poisson Clumping Heuristic

**NEW (2014)*** The book is back in print. This link to the Springer site gives some options; from a University, the "read this book on SpringerLink" tab is the best deal. An old PDF version can be found here Probability Approximations via the Poisson Clumping Heuristic but don't tell Springer.

This idiosyncratic 1989 book Probability Approximations via the Poisson Clumping Heuristic MR 90k:60004 developes a method for writing down first-order approximations in a wide range of "generalized extrema" settings, and illustrates with 100 examples from areas like Markov chain hitting times, extrema of stationary processes, combinatorial maxima, stochastic geometry coverage problems, multidimensional diffusions escaping from potential wells, maxima of Gaussian random fields. Here is a two page account from Encyclopedia of Statistical Science. Here is a 1992 update keyed to the book.

Feingold MR 94j:60138 has a nice application to Markov chain hitting times arising in genetics. A recent preprint by Adler On Excursion Sets, Tube Formulae, and Maxima of Random Fields relates the "maxima of random fields" chapter to subsequent developments of Worsley MR 96b:60132 and others.


Several people (Dacunha-Castelle, Kingman, Figiel and Sucheston) in the mid-1970s observed that, from any tight sequence of random variables, one can find a subsequence which (in some sense) resembles an exchangeable sequence. In a lecture at Oxford in Spring 1975, Kingman suggested this might provide a route to general versions of Chatterji's subsequence principle MR 49:9916 which asserts that from general sequences one can extract a subsequence satisfying analogs of classical limit theorems for i.i.d. sequences. Such an argument (which seems surprisingly subtle) was carried out as part of my thesis and published in 1977 as [2] MR 56:13330. Some further developments were made in 1986 by Berkes and Peter MR 87m:60083. My original interest in the topic, following Dacunha-Castelle and Krivine MR 58:30117, was to prove a conjecture in Banch space theory, that every subspace of $L^1$ contains a subspace isomorphic to some $l_p$. This was proved in [10] MR 83b:46025. Krivine and Maurey MR 83a:46030 quickly found a simpler and more general reformulation applying to stable Banach spaces. See the 1992 monograph by Guerre-Delabriere MR 94d:46013 for subsequent developments.

Consider an infinite array $(X_{ij}, 1 \leq i,j < \infty)$ of random variables of the form

(*) $X_{ij} = f(A_i,B_j,C_{ij})$

for i.i.d. $(A_i)$, $(B_j)$ and $(C_{ij})$. The array $X = (X_{ij})$ has the row and column exchangeability (RCE) property. It is natural (e.g. as a general Bayesian analog of classical statistical analysis of variance) to seek a converse: is every RCE array essentially of the form (*)? This is proved in [11] MR 82m:60022. A different proof based on techniques from logic was given independently by Hoover MR 84b:60016. Here is Hoover's original technical report. At the same time Diaconis and Freedman MR 83e:92071 were thinking about the same question in connection with human visual pattern recognition. My monograph-length survey [22] MR 88d:60107 of probabilistic aspects of exchangeability gave the state of the art in 1983. Characterization theorems were subsequently developed in several directions in a sequence of papers by Kallenberg: see e.g. MR 96f:60063.

Update (2007). Kallenberg's 2005 monograph Probabilistic Symmetries and Invariance Principles elegantly covers the field. The theory of RCE arrays has been rediscovered at last twice since 2000. By Lovasz and co-workers in the context of limits of dense graphs: see e.g. their Limits of dense graph sequences and also see Diaconis-Janson Graph Limits and Exchangeable Random Graphs for the precise connection. And by Vershik Random Metric Spaces and Universality in the context of isometry classes of metric spaces with probability measures.

Weak Convergence (classical aspects)

During the 1970s, two very popular but rarely-connected topics were the theory of weak convergence popularized by Billingsley MR 38:1718 and the general theory of processes of Dellacherie and Meyer MR 85e:60001. Part of my thesis, published in 1978 as [4] MR 57:14086, gave a sufficient condition for tightness in terms of increments after stopping times. This fits nicely with the martingale characterization approach to weak convergence. Similar results and extensions were given by Harlamov MR 55:6600 Rebolledo MR 81g:60002 Jacod-Memin-Metevier MR 84i:60045 and are now textbook material: Jacod and Shiryayev MR 89k:60044, Ethier and Kurtz MR 88a:60130. My own extension [38] MR 90f:60002 is that, given convergence of finite-dimensional distributions to a continuous-path limit process, the only way weak convergence can fail is if there are predictable oscillations of predictable sign.

Around 1979 I circulated drafts of a monograph "Weak Convergence and the General Theory of Processes" but this was never completed.

Mixing Times for Finite Markov Chains

Elementary theory of finite-state irreducible chains asserts that as time $t \to \infty$ the time-$t$ distribution converges to the stationary distribution. To quantify this there are several variants of mixing time, formalizing the idea: how large must $t$ be in order for the time-$t$ distribution to be approximately the stationary distribution? Then given a sequence of chains with some "size" parameter $n$ (e.g. shuffling schemes for $n$-card decks) one can study how the mixing time grows with $n$. Theory asserting equivalence of several definitions and relations to properties of hitting times was initiated in [13] MR 83f:60098. Examples of the use of coupling to estimate mixing times were given in [21] MR 86j:60156, showing in particular that riffle shuffle of a $n$-card deck had mixing time $3/2 \log_2 n$ and introducing the notion of rapid mixing to mean that mixing time is poly-log (number of states). Together with the Diaconis-Shahshahani MR 82h:60024 analysis of card-shuffling via random transpositions, these provided a conceptual basis for mathematical study of mixing times for finite Markov chains and random walks on finite groups. Subsequently Jerrum-Sinclair MR 91g:68084 showed that certain approximate counting problems in the theory of algorithms could be solved using Markov chains, the key issue being analysis of their mixing time: this led to interest in mixing times in the theory of algorithms -- see e.g. the text Motwani - Raghavan MR 96i:65003. In the early 1990s the field of Markov Chain Monte Carlo (MCMC) emerged, in which the central theoretical issue in assessing performance is estimating mixing times in complicated data-dependent chains (this can hardly ever be done rigorously). MCMC is now the topic of several books MR 97d:62006 MR 98b:60121 and a preprint server.

Since 1989 I (and now Jim Fill) have been working on a monograph Reversible Markov Chains and Random Walks on Graphs which may be completed next millennium!

Finite and Countable Random Trees

Perhaps the most frequently rediscovered result in probability is the Markov Chain Tree Theorem giving an expression for the stationary distribution of a finite-state chain in terms of weighted spanning trees. Following conversations with Persi Diaconis we observed [43] MR 91h:60013 (and independently, Broder) that the theorem could be used to provide an algorithm for simulating a uniform random spanning tree of an undirected graph. Further one can use the construction to analyze [43] various properties of such random spanning trees; this was arguably the first non-trivial application of the Markov Chain Tree Theorem. For subsequent uses see the forthcoming monograph of Lyons and Peres .

Call a countable rooted tree with a single path from the root to infinity a tree with one end or a sin-tree. The simplest occurrence of random sin-trees is as critical Galton-Watson processes conditioned to be infinite, e.g. Kesten MR 88b:60232. My 1991 "general abstract nonsense" paper [52] MR 92j:60009 considers families of $n$-vertex rooted trees and supposes that as $n \to \infty$ the subtrees seen from random vertices converge to some limit random tree. Such a limit has a kind of "one-sided stationarity" property, and therefore (by analogy with stationary sequences) specifies a "two-sided" structure, which is a random sin-tree with a certain stationarity property. [52] gives many examples plus structure theory. The particular case of the (random graph) mean-field analog of the minimal spanning tree was studied in [49] MR 93a:05111. Later Penrose MR 97i:60014 showed the same tree arises as the high-dimensional limit of Euclidean minimal spanning trees.

Random sin-trees have subsequently appeared in many other contexts, e.g. as components of uniform spanning forests on countable graphs (see preprint by Benjamini et al). Two of my own recent papers involve sin-trees: Paper [79] discusses pruning processes on conditioned Galton-Watson trees, and the limit of the Erdos-Renyi random graph process as seen from a specified vertex; Paper [89] discusses the percolation process on a tree where infinite clusters are forbidden to grow further.