
Wright-Fisher reproduction

By far the most popular stochastic model for reproduction in population genetics is the Wright-Fisher model (developed implicitly by Fisher in [4] and earlier papers and explicitly by Wright [14]). It is, of course, highly idealized in the form presented below. However, it can be (and has been) generalized and its assumptions relaxed, and even in its pure form, it succeeds in capturing the essence of the biology involved.

Wright-Fisher has many of the same assumptions as Hardy-Weinberg Equilibrium, with the important exception of finite population size N (after all, it is the effects of sampling gametes in a finite population that we are interested in modeling). The really crucial assumptions are:

This is enough to generate the basic reproductive model, but here, being interested in the stochastic fate of a neutral variant, we will further assume neutrality (no selective differences between alleles) and no additional mutation during the sojourn of our variant. This gives us a complete description of the forces acting on alleles across generations, and we can now derive some simple (but as it turns out, quite powerful) results. Of particular interest is the probability of fixation of a mutant, the replacement of one allelic type with another on a population scale. Independent fixation events along separate lineages form the basic evidence for phylogenetic reconstruction of the relationships of those lineages.

First, note that because allelic variants are neutral, it makes no difference to the fate of individuals (and thus the descent of their alleles) how the alleles are distributed among individuals. We can therefore treat a haploid model as equivalent to a diploid one and, in general, whatever the ploidy level, take the "individuals" in the model to be gametes, without regard to their arrangement in the organisms themselves. Because most organisms of biological study are diploid, we will keep that convention, but the only effect in this case is that our population size is 2N gametes, rather than N. The remaining necessary specifications to the model are as follows:

2 alleles, $A$ and $a$, and
$X_t$ = number of $A$ alleles at time $t$

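In the Wright-Fisher model, we imagine that gametes are chosen randomly each generation from an effectively infinite gamete pool reflecting the parental allele frequencies. Then the sampling is binomial: given $X_t = i$ copies of an allele among the 2N gametes of generation $t$,

$$P(X_{t+1} = j \mid X_t = i) = \binom{2N}{j} \left(\frac{i}{2N}\right)^{j} \left(1 - \frac{i}{2N}\right)^{2N-j}, \qquad j = 0, 1, \ldots, 2N.$$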
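
As a minimal sketch of one generation of this sampling scheme, the Python snippet below draws 2N offspring gametes, each carrying the focal allele with probability equal to its parental frequency. The function name `next_generation` and its parameters are illustrative choices, not part of the original notes, and a constant population size of 2N gametes is assumed.

```python
import random

def next_generation(count, two_n, rng=random):
    """Return the focal-allele count after one Wright-Fisher generation.

    count: copies of the focal allele among the parents (0..two_n)
    two_n: total number of gametes, 2N (assumed constant over time)
    rng:   any object with a random() method returning a float in [0, 1)
    """
    p = count / two_n  # parental allele frequency
    # Each of the 2N offspring gametes is an independent Bernoulli(p) draw,
    # so the total is a Binomial(2N, p) random variable.
    return sum(rng.random() < p for _ in range(two_n))

if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed only to make the demo repeatable
    print(next_generation(10, 20, rng))
```

Note that counts of 0 and 2N reproduce themselves with certainty, which is why loss and fixation are absorbing states.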

Recall that one of the implications of Hardy-Weinberg was that under random mating and absent any directional perturbing forces such as mutation and selection, genetic systems will be at a stable equilibrium. Here, although we are allowing stochastic fluctuations in the allele count $X_t$ from generation-to-generation sampling, there is no directionality expected in the changes. This, plus the observation that $X_t$ is Markovian, justifies the assertion that

$$E[X_{t+1} \mid X_0, X_1, \ldots, X_t] = E[X_{t+1} \mid X_t] = X_t,$$

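and now we see that $X_t$ is also a martingale, with two possible limits, 0 and 2N. We can further write

$$E[X_{t+1}] = E\big[E[X_{t+1} \mid X_t]\big] = E[X_t]$$

and derive

$$E[X_t] = E[X_0] \quad \text{for all } t \ge 0.$$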
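
The absence of directionality can be checked numerically. The sketch below sums over the binomial transition probabilities to compute the expected allele count in the next generation, which should equal the current count up to floating-point error. The helper names `transition_prob` and `expected_next` are hypothetical, introduced only for this illustration.

```python
from math import comb

def transition_prob(i, j, two_n):
    """Probability of going from i copies to j copies in one generation,
    under binomial sampling of two_n gametes."""
    p = i / two_n
    return comb(two_n, j) * p**j * (1 - p)**(two_n - j)

def expected_next(i, two_n):
    """Expected allele count in the next generation, given i copies now,
    computed by direct summation over all possible outcomes."""
    return sum(j * transition_prob(i, j, two_n) for j in range(two_n + 1))

if __name__ == "__main__":
    two_n = 20
    for i in (0, 1, 7, 20):
        # The second column should match the first up to rounding error.
        print(i, round(expected_next(i, two_n), 10))
```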

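It follows from the stopping time theorem for bounded martingales that the probability of being absorbed at either of the two boundaries, starting from $X_0 = i$ copies, is

$$P(\text{absorbed at } 2N \mid X_0 = i) = \frac{i}{2N}, \qquad P(\text{absorbed at } 0 \mid X_0 = i) = 1 - \frac{i}{2N}.$$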
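
As a numerical cross-check on the absorption probabilities, the following sketch builds the one-generation transition matrix and applies it repeatedly to the indicator of fixation. This is a plain iterative approximation, not a closed-form solution, and the name `fixation_probabilities` and the `sweeps` parameter are assumptions made for the example. The number of sweeps should be large relative to 2N for the values to settle.

```python
from math import comb

def fixation_probabilities(two_n, sweeps=2000):
    """Approximate the probability of eventual fixation for every starting
    count i = 0..two_n by iterating the one-generation transition matrix."""
    # Row i holds the Binomial(2N, i/2N) probabilities of next-generation counts.
    matrix = []
    for i in range(two_n + 1):
        p = i / two_n
        matrix.append([comb(two_n, j) * p**j * (1 - p)**(two_n - j)
                       for j in range(two_n + 1)])
    # Start from the indicator "already fixed"; after t sweeps, u[i] is the
    # probability that a population starting with i copies is fixed by time t.
    u = [0.0] * two_n + [1.0]
    for _ in range(sweeps):
        u = [sum(row[j] * u[j] for j in range(two_n + 1)) for row in matrix]
    return u

if __name__ == "__main__":
    two_n = 10
    for i, prob in enumerate(fixation_probabilities(two_n)):
        print(i, round(prob, 6), i / two_n)
```

The last two columns printed should agree closely: the fixation probability is the starting frequency of the allele.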

We are interested mainly in the situation where a new allele has entered a monomorphic population (through, perhaps, mutation). This result tells us that when the new mutant enters the population (in a single copy, $X_0 = 1$), the probability that it eventually fixes and replaces the resident is its frequency, $1/2N$.
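
The single-mutant case lends itself to a quick Monte Carlo check. The sketch below follows one new mutant copy forward under neutral binomial sampling until it is lost or fixed, and reports the fraction of replicates ending in fixation. The function names, the seed, and the replicate count are illustrative assumptions, and the estimate carries sampling noise.

```python
import random

def mutant_fixes(two_n, rng):
    """Follow a single new mutant copy until loss or fixation.
    Returns True if the mutant reaches a count of two_n (fixation)."""
    count = 1
    while 0 < count < two_n:
        p = count / two_n
        # Binomial(2N, p) draw built from 2N independent Bernoulli trials.
        count = sum(rng.random() < p for _ in range(two_n))
    return count == two_n

def estimate_fixation(two_n, replicates, seed=0):
    """Fraction of independent replicates in which the mutant fixes."""
    rng = random.Random(seed)
    fixed = sum(mutant_fixes(two_n, rng) for _ in range(replicates))
    return fixed / replicates

if __name__ == "__main__":
    two_n = 20
    print("simulated fixation fraction:", estimate_fixation(two_n, 5000))
    print("initial frequency of mutant:", 1 / two_n)
```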

There are other ways to derive this result, one being to solve the Markov chain directly. Another makes use of the "coalescent" reasoning described earlier by considering the genealogy of alleles in the following way: at time 0, there will be 2N gametes in the population, any of which might or might not leave descendants in the next generation. If they do not, the lineage of that allele copy is extinct in the population. If we follow the population through time, eventually all but one of the 2N original lineages will be extinct, and the remaining one will be fixed in the population. Because all of the original gametes have equal probability of generating the surviving lineage, the fixation probability of any allelic type is simply the frequency of that type. Although this is simply a verbal argument, the genealogical perspective underlying it is an extremely powerful one in analyzing molecular sequence data, and it is thus worth thinking about some long-solved problems in this way.
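
The genealogical argument can be illustrated by labelling each gamete at time 0 and tracking which founding lineage survives. In the sketch below, every offspring gamete picks a parent uniformly at random and inherits that parent's founder label. Over many replicates each founder should win about equally often, roughly 1/2N of the time. The names `surviving_founder` and `founder_win_frequencies` are hypothetical, and the simulation is a forward-in-time illustration, not an implementation of the coalescent itself.

```python
import random

def surviving_founder(two_n, rng):
    """Return the index (0..two_n-1) of the time-0 gamete whose lineage
    eventually takes over the whole population."""
    # ancestors[k] records which founding gamete the k-th current gamete
    # descends from; initially every gamete is its own founder.
    ancestors = list(range(two_n))
    while len(set(ancestors)) > 1:
        # Neutral reproduction: each offspring gamete chooses its parent
        # uniformly at random and copies that parent's founder label.
        ancestors = [rng.choice(ancestors) for _ in range(two_n)]
    return ancestors[0]

def founder_win_frequencies(two_n, replicates, seed=0):
    """Fraction of replicates won by each of the two_n founding gametes."""
    rng = random.Random(seed)
    wins = [0] * two_n
    for _ in range(replicates):
        wins[surviving_founder(two_n, rng)] += 1
    return [w / replicates for w in wins]

if __name__ == "__main__":
    for founder, freq in enumerate(founder_win_frequencies(8, 4000)):
        print(founder, freq)
```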


Simon Cawley
Tue May 12 11:50:21 PDT 1998