Department of Statistics
University of California
Berkeley, California
Spring Semester 2000

Statistics 260
Statistical Genetics (Statgen)
Units: 4.0

Instructor: Terry Speed
Office Hours: 327 Evans Hall, times to be announced.
email: terry@stat.berkeley.edu

Course Outline

Modelling meiosis, linkage mapping, pedigree analysis, genetic epidemiology. Clone libraries, physical mapping of chromosomes. Radiation hybrid mapping. DNA and protein sequence analysis, molecular evolution, sequence alignment, database searching. Analysis of microarray expression data.

SYLLABUS

Material will be selected from the following list to reflect the interests of those taking the course.

Mendel and segregation: discussion of his results. Meiosis. Stochastic models for recombination, including no chromatid interference and the chi-square models. Multilocus mapping in experimental crosses (backcross and F2 intercross); the Lander-Green hidden Markov model for calculating probabilities. Mouse experiments: mapping genes for qualitative and quantitative traits using genome-wide scans. Classification and regression trees to identify interactions.

Genetic epidemiology: association and linkage in various settings, including case-control and non-transmitted chromosome controls. Affected sib-pair methods and their extensions: parameter spaces, score tests; the Haseman-Elston test and an alternative. Multilocus mapping in pedigrees: the Lander-Green algorithm again, for modest-sized pedigrees. Calculation of probabilities and likelihoods on large pedigrees by the Elston and Stewart algorithm, by Markov chain Monte Carlo, and other approaches.

DNA sequencing and the polymerase chain reaction. Clones and clone maps. Lander and Waterman theory and its extensions for contig statistics. STS content mapping and other approaches to physical mapping.

Simple statistics of DNA sequences: base and k-mer composition. Profiles. Hidden Markov models for finding genes in DNA sequence. Markov process models for molecular evolution. Inference with phylogenies. Sequence alignment: local, global, pairwise and multiple. Dynamic programming algorithms for optimal alignments. Database searching: precise and heuristic algorithms. Gene families.

Analysis of microarray expression data. Precision, reliability, discriminant and cluster analysis.

Rationale

Statistical methods play an important role in several areas of modern genetics, such as linkage and other types of mapping, biomolecular sequence analysis, and the analysis of gene expression data. There is a need for statisticians who understand how statistics is used in this field, and for some of them to go on to carry out research in it. In addition, students and researchers in genetics will want a better understanding of this area of applied statistics.

Prerequisites

Statistics 200A/B or equivalent.

Texts

READING LIST

Ott, J. Analysis of human genetic linkage. Rev. ed. Johns Hopkins, 1991.

Waterman, M. S. Introduction to computational biology: maps, sequences and genomes. Chapman and Hall, London, 1995.

Other references will be given during the course and notes will also be provided.