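Now let us consider the Kimura 3-parameter (K3) model for nucleotide substitutions. It is defined by the following infinitesimal generator:

$$Q = \begin{pmatrix} -(\alpha+\beta+\gamma) & \alpha & \beta & \gamma \\ \alpha & -(\alpha+\beta+\gamma) & \gamma & \beta \\ \beta & \gamma & -(\alpha+\beta+\gamma) & \alpha \\ \gamma & \beta & \alpha & -(\alpha+\beta+\gamma) \end{pmatrix}$$

where $\alpha > 0$, $\beta > 0$, $\gamma > 0$, and each diagonal entry makes its row sum to zero. In the Jukes-Cantor (JC) model $\alpha = \beta = \gamma$, while the case $\beta = \gamma$ is the Kimura 2-parameter (K2) model. It is as easy to deal with K3 as with JC (at least with the following approach), and we now do so.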
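The transition matrix associated with $Q$ is simply $P(t) = \exp(tQ)$, where $t$ is in some time units. The eigenvalues of $Q$ are $0$, $-2(\alpha+\beta)$, $-2(\alpha+\gamma)$ and $-2(\beta+\gamma)$. Therefore,

$$P(t) = \begin{pmatrix} r_t & s_t & u_t & v_t \\ s_t & r_t & v_t & u_t \\ u_t & v_t & r_t & s_t \\ v_t & u_t & s_t & r_t \end{pmatrix},$$

where

$$\begin{aligned}
r_t &= \tfrac14\left(1 + e^{-2(\beta+\gamma)t} + e^{-2(\alpha+\gamma)t} + e^{-2(\alpha+\beta)t}\right), \\
s_t &= \tfrac14\left(1 + e^{-2(\beta+\gamma)t} - e^{-2(\alpha+\gamma)t} - e^{-2(\alpha+\beta)t}\right), \\
u_t &= \tfrac14\left(1 - e^{-2(\beta+\gamma)t} + e^{-2(\alpha+\gamma)t} - e^{-2(\alpha+\beta)t}\right), \\
v_t &= \tfrac14\left(1 - e^{-2(\beta+\gamma)t} - e^{-2(\alpha+\gamma)t} + e^{-2(\alpha+\beta)t}\right).
\end{aligned}$$

In the JC case, $\alpha = \beta = \gamma$, and this simplifies to $r_t = \tfrac14 + \tfrac34 e^{-4\alpha t}$ and $s_t = u_t = v_t = \tfrac14 - \tfrac14 e^{-4\alpha t}$. We conclude that the stationary distribution is $(\tfrac14, \tfrac14, \tfrac14, \tfrac14)$, as $P(\infty) = \lim_{t \to \infty} P(t)$ has all its rows equal to this.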
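As a numerical sanity check on the spectral form of the transition matrix, the following sketch builds a K3 generator, evaluates the closed-form entries of $\exp(tQ)$ from the eigenvalues $0$, $-2(\alpha+\beta)$, $-2(\alpha+\gamma)$, $-2(\beta+\gamma)$, and compares them with a truncated Taylor series. The state ordering, the function names and the rate values are assumptions chosen for illustration only.

```python
import math

def k3_generator(alpha, beta, gamma):
    """K3 rate matrix under an assumed state ordering
    (states 1-2 and 3-4 exchange at rate alpha)."""
    d = -(alpha + beta + gamma)
    return [
        [d, alpha, beta, gamma],
        [alpha, d, gamma, beta],
        [beta, gamma, d, alpha],
        [gamma, beta, alpha, d],
    ]

def k3_transition(alpha, beta, gamma, t):
    """Closed-form P(t) = exp(tQ) built from the eigenvalues of Q."""
    ea = math.exp(-2 * (beta + gamma) * t)
    eb = math.exp(-2 * (alpha + gamma) * t)
    ec = math.exp(-2 * (alpha + beta) * t)
    r = (1 + ea + eb + ec) / 4  # no net change
    s = (1 + ea - eb - ec) / 4  # alpha-type change
    u = (1 - ea + eb - ec) / 4  # beta-type change
    v = (1 - ea - eb + ec) / 4  # gamma-type change
    return [
        [r, s, u, v],
        [s, r, v, u],
        [u, v, r, s],
        [v, u, s, r],
    ]

def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(q, t, terms=60):
    """exp(tQ) by a truncated Taylor series; adequate when t*|Q| is small."""
    n = len(q)
    tq = [[t * x for x in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, tq)]
        for i in range(n):
            for j in range(n):
                result[i][j] += term[i][j]
    return result

# Hypothetical rates and time, for illustration only.
alpha, beta, gamma, t = 0.3, 0.1, 0.05, 1.5
closed = k3_transition(alpha, beta, gamma, t)
series = expm_taylor(k3_generator(alpha, beta, gamma), t)
max_diff = max(abs(closed[i][j] - series[i][j])
               for i in range(4) for j in range(4))
print(max_diff)  # should be at the level of floating-point rounding
```

Setting all three rates equal in `k3_transition` reproduces the JC entries, and letting `t` grow drives every entry towards 1/4.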
Notice that $t$ is confounded with $\alpha$, $\beta$ and $\gamma$ in the above expressions: they enter only through products such as $\alpha t$. Instead of studying $\alpha$ or $t$ separately, it is the product, $\alpha t$, also known as the "amount of evolution", which is studied. More precisely, in the JC case $3\alpha t$ is the quantity that gives us the expected number of changes down a lineage over a time $t$.
Our interest in these models is in the table of joint probabilities of observed pairs of nucleotides at one position, following separate evolution from a common ancestor t time units back (e.g., ).
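Letting $X$ denote the nucleotide at the ancestral site, and $X_1$, $X_2$ the nucleotides observed in the two descendants,

$$F(x_1, x_2) = \mathrm{pr}(X_1 = x_1, X_2 = x_2) = \sum_x \pi(x)\, P_1(x, x_1)\, P_2(x, x_2),$$

where $\pi(x) = \mathrm{pr}(X = x)$ is the probability of $x$ at the ancestral node, and $P_1 = \exp(tQ_1)$, and $P_2 = \exp(tQ_2)$. If these last two expressions have the JC form, not necessarily with the same $\alpha$, we can easily evaluate $F$ and similar expressions.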
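The joint table can be sketched numerically as follows. This is a minimal illustration, assuming JC-form transition matrices on both branches (with hypothetical rates `alpha1` and `alpha2`) and a uniform ancestral distribution; the function names are not from the text.

```python
import math

def jc_transition(alpha, t):
    """JC transition matrix: 1/4 + 3/4 e^{-4 alpha t} on the diagonal,
    1/4 - 1/4 e^{-4 alpha t} elsewhere."""
    decay = math.exp(-4 * alpha * t)
    same = 0.25 + 0.75 * decay
    diff = 0.25 - 0.25 * decay
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def joint_table(pi, p1, p2):
    """F[x1][x2] = sum over ancestral states x of pi[x] * P1[x][x1] * P2[x][x2]."""
    n = len(pi)
    return [[sum(pi[x] * p1[x][x1] * p2[x][x2] for x in range(n))
             for x2 in range(n)]
            for x1 in range(n)]

def prob_differ(f):
    """Probability that the two descendants differ: the off-diagonal mass of F."""
    return 1.0 - sum(f[i][i] for i in range(len(f)))

# Hypothetical inputs, for illustration only.
pi = [0.25, 0.25, 0.25, 0.25]
alpha1, alpha2, t = 0.2, 0.05, 1.0
f = joint_table(pi, jc_transition(alpha1, t), jc_transition(alpha2, t))
d = prob_differ(f)
print(d)
```

With a uniform ancestral distribution the table is symmetric, and its off-diagonal mass depends on the two rates only through their sum.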
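Our interest here is in obtaining simple expressions for $\mathrm{pr}(X_1 \neq X_2)$, the probability that the two observed nucleotides $X_1$ and $X_2$ differ.
It is easy to see that we can also write

$$F(x_1, x_2) = \pi(x_1)\,(P_1 P_2)(x_1, x_2),$$

since $P_1$ is reversible with respect to $\pi$, that is, $\pi(x)\,P_1(x, x_1) = \pi(x_1)\,P_1(x_1, x)$, and so we can derive the expression

$$\mathrm{pr}(X_1 \neq X_2) = 1 - \sum_x \pi(x)\,(P_1 P_2)(x, x).$$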
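Suppose that $Q_1$ and $Q_2$ have the JC form with parameters $\alpha_1$ and $\alpha_2$. Then using the spectral representation of $\exp(tQ)$, we easily check that when $\pi = (\tfrac14, \tfrac14, \tfrac14, \tfrac14)$ (the limiting frequency),

$$\mathrm{pr}(X_1 \neq X_2) = \tfrac34\left(1 - e^{-4(\alpha_1+\alpha_2)t}\right).$$

Thus we obtain the famous Jukes-Cantor correction for multiple substitutions in the form

$$3(\alpha_1+\alpha_2)t = -\tfrac34 \log\left(1 - \tfrac43\,\mathrm{pr}(X_1 \neq X_2)\right),$$

which, with the observed proportion of differing sites substituted for $\mathrm{pr}(X_1 \neq X_2)$, gives an estimate of $3(\alpha_1+\alpha_2)t$, the expected number of events separating the two taxa.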
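In practice the correction is applied to the observed proportion of differing sites in an alignment. The sketch below assumes the usual form of the Jukes-Cantor correction, $-\tfrac34 \log(1 - \tfrac43 \hat p)$, with $\hat p$ the proportion of mismatched positions; the function name and the toy sequences are hypothetical.

```python
import math

def jc_distance(seq1, seq2):
    """Jukes-Cantor corrected distance between two aligned sequences.

    Returns an estimate of the expected number of substitutions per site
    separating the two sequences.
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    # Observed proportion of sites at which the two sequences differ.
    p_hat = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
    if p_hat >= 0.75:
        # The logarithm is undefined once the proportion reaches 3/4.
        raise ValueError("proportion of differences >= 3/4: correction undefined")
    return -0.75 * math.log(1 - 4 * p_hat / 3)

# Toy alignment (hypothetical): 1 difference in 10 sites.
d = jc_distance("ACGTACGTAC", "ACGTACGAAC")
print(d)
```

The corrected distance always exceeds the raw proportion of differences when that proportion is positive, reflecting substitutions hidden by multiple hits at the same site.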