11 Markov Chain Monte Carlo (January 8 2001)

11.2 The two basic schemes

We will present general definitions and discussion in the context of finite-state chains on a state space SS; translating to continuous state space such as RdR^{d} involves slightly different notation without any change of substance.

11.2.1 Metropolis schemes

Write K=(kxy)K=(k_{xy}) for a proposal transition matrix on SS. The simplest case is where KK is symmetric (kxykyxk_{xy}\equiv k_{yx}). In this case, given π\pi on SS we define a step xxx\to x^{\prime} of the associated Metropolis chain in words by

  • pick yy from k(x, )k(x,\cdot) and propose a move to yy;

  • accept the move (i.e. set x=yx^{\prime}=y) with probability min(1,πy/πx)\min(1,\pi_{y}/\pi_{x}), otherwise stay (x=xx^{\prime}=x).

This recipe defines the transition matrix PP of the Metropolis chain to be

pxy=kxymin(1,πy/πx), yx.p_{xy}=k_{xy}\min(1,\pi_{y}/\pi_{x}),\quad y\neq x.

Assuming KK is irreducible and π\pi strictly positive, then clearly PP is irreducible. Then since πxpxy=kxymin(πx,πy)\pi_{x}p_{xy}=k_{xy}\min(\pi_{x},\pi_{y}), symmetry of KK implies PP satisfies the detailed balance equations and so is reversible with stationary distribution π\pi.

The general case is where KK is an arbitrary transition matrix, and the acceptance rule becomes

  • accept a proposed move xyx\to y with probability min(1,πykyxπxkxy)\min(1,\frac{\pi_{y}k_{yx}}{\pi_{x}k_{xy}}).

The transition matrix of the Metropolis chain becomes

pxy=kxymin(1,πykyxπxkxy), yx.p_{xy}=k_{xy}\min\left(1,\frac{\pi_{y}k_{yx}}{\pi_{x}k_{xy}}\right),\quad y% \neq x. (11.5)

To ensure irreducibility, we now need to assume connectivity of the graph on SS whose edges are the (x,y)(x,y) such that min(kxy,kyx)>0\min(k_{xy},k_{yx})>0. Again detailed balance holds, because

πxpxy=min(πxkxy,πykyx), yx.\pi_{x}p_{xy}=\min(\pi_{x}k_{xy},\pi_{y}k_{yx}),\quad y\neq x.

The general case is often called Metropolis-Hastings – see Notes for terminological comments.

11.2.2 Line-sampling schemes

The abstract setup described below comes from Diaconis [124]. Think of each SiS_{i} as a line, i.e. the set of points in a line.

Suppose we have a collection (Si)(S_{i}) of subsets of state space SS, with iSi=S\cup_{i}S_{i}=S. Write I(x):={i:xSi}I(x):=\{i:x\in S_{i}\}. Suppose for each xSx\in S we are given a probability distribution iw(i,x)i\to w(i,x) on I(x)I(x), and suppose

 if x,ySix,y\in S_{i} then w(i,x)=w(i,y)w(i,x)=w(i,y).\mbox{ if $x,y\in S_{i}$ then $w(i,x)=w(i,y)$}. (11.6)

Write π[i]()=π(|Si)\pi^{[i]}(\cdot)=\pi(\cdot|S_{i}). Define a step xyx\to y of the line-sampling chain in words by

  • choose ii from w(,x)w(\cdot,x);

  • then choose yy from π[i]\pi^{[i]}.

So the chain has transition matrix

pxy=iI(x)w(i,x)πy[i], yx.p_{xy}=\sum_{i\in I(x)}w(i,x)\pi^{[i]}_{y},\quad y\neq x.

We can rewrite this as

pxy=iI(x)I(y)w(i,x)πy/π(Si)p_{xy}=\sum_{i\in I(x)\cap I(y)}w(i,x)\pi_{y}/\pi(S_{i})

and then (11.6) makes it clear that πxpxy=πypyx\pi_{x}p_{xy}=\pi_{y}p_{yx}. For irreducibility, we need the condition

the union over ii of the edges in the complete graphs
on SiS_{i} form a connected graph on SS.\mbox{on $S_{i}$ form a connected graph on $S$}. (11.7)

Note in particular we want the SiS_{i} to be overlapping, rather than a partition.

This setting includes many examples of random walks on combinatorial sets. For instance, card shuffling by random transpositions (yyy cross-ref) is essentially the case where the collection of subsets consists of all 22-card subsets. In the RdR^{d} setting, with target density ff, the Gibbs sampler is the case where the collection consists of all lines parallel to some axis. Taking instead all lines in all directions gives the hit-and-run sampler, for which a step from xx is defined as follows.

  • Pick a direction uniformly at random, i.e. a point yy on the surface on the unit ball.

  • Step from xx to x+Uyx+Uy, where -<U<-\infty<U<\infty is chosen with density proportional to

    ud-1f(x+uy).u^{d-1}f(x+uy).

The term ud-1u^{d-1} here arises as a Jacobean; see Liu [237] Chapter 8 for explanation and more examples in RdR^{d}.