The Continuum Random Tree

Note: written around 1999 and not updated since then.

This is a chatty discussion of my research on this topic, intended to be understandable to a Ph. D. student in theoretical or applied probability. Numbers like [55] refer to the bibliography and are accompanied by Math Reviews links, and paper gives you the paper in compressed Postscript. Instead of summarizing the results (for which see Math Reviews or the actual papers) I focus on context: where did the problem come from, and what subsequent work has been done?

I started thinking about these matters in the late 1980s. A three-lecture series at a 1990 Durham workshop became the survey paper [55] MR 93f:60010 based on the two technical papers [46,56] MR 91i:60024 MR 94c:60015. The first section below summarizes the state of the art at that time; later sections describe subsequent work on continnum trees and parallel work in superprocesses.

The Portmanteau Theorem

A finite tree has (say) $n$ vertices and $n-1$ edges, with a unique path between any two vertices. Regarding the edges as having length $1$, we get a distance $d(v,w)$ between vertices $v$ and $w$. Assigning "mass" $1/n$ to each vertex gives a "mass measure" on the vertices with total mass $1$. A continuum tree is a metric space with the "tree" property that between any two points $v$ and $w$ there is a unique path. This implicitly specifies a length measure $\lambda$ such that $d(v,w)$ is the length of the set of points on the path from $v$ to $w$; we also suppose a continuum tree is equipped with as mass measure $\mu$ of total mass $1$, representing a method for picking a vertex at random. We now consider a random continuum tree -- which I call a continuum random tree or CRT because it sounds better! It is not obvious that there is any natural probability law on continuum trees, but the key fact is that there exists a Brownian CRT whose law is specified in any of the following four equivalent ways.

Law of spanning subtrees

Consider the space of trees with $k$ leaves labeled $1,2,\ldots,k$ and with unlabeled internal vertices, and where the $2k-3$ edge-lengths are real numbers. A probability law for a random tree in this space can be defined by:

(*) for each possible topological shape, the chance that the tree has that particular shape and that the vector of edge-lengths $(L_1,\ldots,L_{2k-3})$ is in $([l_i, l_i + dl_i], 1 \leq i \leq 2k-3)$ equals $s \exp(-s^2/2) dl_1 \ldots dl_{2k-3}$, where $s = \sum_i l_i$.

Take a realization of the Brownian CRT, then pick $k$ i.i.d. vertices from the mass measure, and consider the spanning subtree on these $k$ vertices. The unconditional law of this subtree is the law (*).

Construction from Brownian excursion

Consider an excursion-type function $f:[0,1] \to [0,\infty)$ with $f(0) = f(1) = 0$ and $f(x)>0$ elsewhere. Use $f$ to define a continuum tree as follows. Define a pseudo-metric on $[0,1]$ by: $d(x,y) = f(x) + f(y) - 2 \min(f(u): x \leq u \leq y), x\leq y$. The continuum tree is the associated metric space, and the mass measure is the image of Lebesgue measure on $[0,1]$. Using this construction with standard Brownian excursion gives the Brownian CRT.

Line-breaking construction

Cut the line $[0,\infty)$ into finite segments at the points of a non-homogeneous Poisson process of intensity $\lambda(x) = x$. Build a tree by inductively attaching a segment $[x_i,x_{i+1}]$ to a uniform random point of the tree built from the earlier segments. The tree built from the first $k-1$ segments has law (*). The metric space closure of the tree built from the whole half-line is the Brownian CRT, where the mass measure is the a.s. weak limit of the empirical law of the first $k$ cut-points.

Weak limit of conditioned critical Galton-Watson branching processes

Take a critical Galton-Watson branching process where the offspring law has finite non-zero variance, and condition on total population until extinction being $n$. This gives a random tree. Rescale edge-lengths to have length $n^{-1/2}$. Put mass $1/n$ on each vertex. In a certain sense that can be formalized, the $n \to \infty$ weak limit of these random trees is the Brownian CRT (up to a scaling factor).

Subsequent developments

Fractal/self-similarity aspects

Just as a sample path of $d$-dimensional Brownian motion has dimension $2$, so a realization of the Brownian CRT has dimension $2$. One can think of the Brownian CRT as providing a tractable special model of random (approximately) self-similar structures, and several papers deal with different aspects of this. In [67] MR 95i:60007 I elaborate the statistical self-similarity construction obtained by decomposing at the median into three subtrees, analogous to general recursive constructions of random fractals (Graf et al MR 88k:28010). In [62] MR 95b:60011 the combinatorial bijection between trees and triangulations of a $n$-gon is used to show there is a well-defined notion of uniform random triangulation of the circle. Krebs MR 96b:60124 formalizes informal remarks in [55] by constructing a diffusion ("Brownian motion") whose state space is the Brownian CRT, analogous to well-known study of diffusions on deterministic and random fractals: MR 96e:60002, MR 98i:60072.

The additive coalescent

A more recent and perhaps more surprising aspect of CRTs is their application to stochastic coalescence. developed in recent work with Jim Pitman. For $0< \lambda < \infty$ split the Brownian CRT into components at the points of a Poisson process of rate $\lambda$ along the skeleton of the tree. This gives a vector $Y(\lambda) = (Y_1(\lambda),Y_2(\lambda),......)$ of masses of the components, which as $\lambda$ increases specifies a fragmentation process. In paper [82] it is shown that reversing the direction of time by setting $\lambda = e^{-t}$ provides a construction of an interesting Markov process, the standard additive coalescent. In paper [87]. we introduce a new family of inhomogeneous continuum random trees by generalizing the line-breaking construction of the Brownian CRT. The fragmentation process for this family yields more general instances of the additive coalescent. As with the Brownian CRT, these inhomogeneous CRTs have equivalent descriptions via the laws of spanning subtrees: see paper [88] for this and other combinatorial aspects.

ISE

One can embed the Brownian CRT into $d$-dimensional space by regarding the tree as an index-set of tree-indexed Brownian motion with values in $d$ dimensions, or equivalently as a weak limit of rescaled critical branching random walks in $d$ dimensions. The resulting random measure, integrated superBrownian excursion (ISE), was introduced in paper [66]: its support is a tree in dimensions $d \geq 8$ but not in lower dimensions. See Borgs et al (Ann. Comb 3 (1999) 205-221) for a more combinatorial description. As described in [66] it is natural to believe that in high dimensions ISE is a limit of uniform random lattice trees. This has been proved by Derbez and Slade MR 98i:82025. More remarkably, Hara and Slade recently show that near-critical percolation clusters in high dimensions are asymptotically like ISE: Electron Res. Announc. Amer. Math. Soc. 4, 48-55.

In a somewhat different direction, a large deviation priciple for ISE (interesting because it involves the optimal traveling salesman tour length) argued heuristically in [66] and proved rigorously by Dembo and Zeitouni MR 97d:60051 and Serlet MR 98j:60041.

Limit laws for random trees

The general result of [56] giving the Brownian CRT as a weak limit of random $n$-vertex trees encompasses a wide variety of distributional limit theorems, incorporating different combinatorial models and different functionals. Various special cases, together with explicit formulas for the law of the functional of Brownian excursion arising as the limit, can be found in papers of Takacs MR 93g:05127 MR 94m:60194. The technically intricate case of Brownian excursion local time appearing as the limit of the "height profile" of random trees has been proved by Drmota - Gittenberger Random Structures Alg 10 421-451.

Limits of other combinatorial structures

Heuristically, one expects the Brownian CRT to arise as a weak limit of finite random trees for quite a wide variety of models of "random trees", not just the conditioned Galton-Watson trees of [56]. Camarri and Pitman paper show this for a range of birthday trees. In [61] MR 95k:60055 (with Jim Pitman) we considered uniform random mappings on a $n$-set, and showed that weak limit results could be coded in terms of reflecting Brownian bridge.

Trees in $d$ dimensions

The work above refers to "abstract" trees rather than the (much more complicated) case of trees in $d$ dimensions whose definition makes essential use of $d$-dimensional geometry. A recent preprint of Aizenman et al discusses natural examples of trees in the two-dimensional plane, and prove tightness results strongly suggestive of existence of "canonical" limit continuum trees in $R^2$.

Superprocesses and historical processes

xxx to be written.