PH 296
Fall 2001
Discussion - Fall 2001
Monday, November 26th
Pattern Discovery in Protein Sequences
Katerina Kechris
Department of Statistics, UC Berkeley
The rapid growth of sequence databases has motivated the development of
techniques to identify similarities in related sequences. For example,
conserved positions across a family of protein sequences may indicate
sites which are functionally or structurally important. These sites may
remain constant because of selective pressures, while other, non-essential
sites are more likely to tolerate mutations over evolutionary time. Once
common features are extracted from a family of structurally, functionally
or evolutionarily related sequences, they can be used for classification
of new examples.
There are a variety of methods for automatic pattern discovery. In this
talk, I will discuss several approaches separated into two sections
according to the pattern type. Deterministic patterns match or do not
match a sequence. These are found by enumerating the solution space of the
defined pattern, in the language of regular expressions. Probabilistic
patterns assign a probability or score to the match between a sequence and
the pattern. These may be found by fitting a statistical model. This
talk is not meant to be an exhaustive survey of the different methods, but
rather, an introduction to the different approaches through several
illustrative examples.
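The distinction between the two pattern types can be illustrated with a minimal Python sketch. It is not taken from the talk: the toy family, the regular expression, the uniform background, and the pseudocount are all hypothetical choices made for illustration. The deterministic pattern gives a yes/no answer, while the probabilistic pattern (here a simple position weight matrix, one of several possible statistical models) gives a score.

```python
import math
import re

# Hypothetical toy family of aligned protein fragments (not real data).
FAMILY = ["CAHC", "CGHC", "CAHC", "CSHC"]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Deterministic pattern: a regular expression either matches or does not.
# Here: C, then one of A/G/S, then H, then C.
DETERMINISTIC = re.compile(r"C[AGS]HC")


def matches(seq):
    """Return True if the deterministic pattern occurs anywhere in seq."""
    return DETERMINISTIC.search(seq) is not None


def build_pwm(aligned, pseudocount=1.0):
    """Estimate per-position log-odds scores against a uniform background.

    The uniform background and the pseudocount are simplifying assumptions.
    """
    background = 1.0 / len(AMINO_ACIDS)
    pwm = []
    for pos in range(len(aligned[0])):
        counts = {a: pseudocount for a in AMINO_ACIDS}
        for seq in aligned:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {a: math.log2((counts[a] / total) / background) for a in AMINO_ACIDS}
        )
    return pwm


def score(pwm, window):
    """Sum the log-odds scores of a window with the same length as the pattern."""
    return sum(column[residue] for column, residue in zip(pwm, window))


PWM = build_pwm(FAMILY)
```

With this sketch, `matches("MKCTHCLL")` is false because T is not among the allowed residues, whereas `score(PWM, "CTHC")` is merely lower than `score(PWM, "CAHC")`: the probabilistic pattern grades near-misses instead of rejecting them outright.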
Handout: ps, pdf
Monday, November 19th
Gene Finding with Hidden Markov Models
Marina Alexandersson
Department of Statistics, UC Berkeley
A fundamental task in analyzing genomes is to annotate various
features of biological importance. While this is relatively
straightforward for organisms with compact genomes (such as bacteria or
yeast), it becomes much more challenging for large genomes (such as
mammals) because the coding "signal" is scattered in a vast sea of
non-coding "noise".
Hidden Markov models (HMMs) have been successfully applied to a
variety of problems in molecular biology, ranging from alignment
problems to gene finding and annotation.
In this talk we discuss the various forms and algorithms of HMMs used
in sequence analysis, including pair HMMs (PHMMs) and generalized HMMs
(GHMMs), and we show the pros and cons of extending the
theory to cross-species gene recognition.
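As a rough illustration of how a basic HMM can separate coding "signal" from non-coding "noise", the sketch below runs the Viterbi algorithm on a two-state model in Python. It is not the speaker's model: the state names and all start, transition, and emission probabilities are hypothetical toy values, and real gene finders use far richer state spaces (such as the GHMMs mentioned above). The input sequence is assumed to be non-empty and to contain only A, C, G, and T.

```python
import math

# Hypothetical two-state model; all probabilities are illustrative toy values.
STATES = ("coding", "noncoding")
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "noncoding": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Return the most probable state path for a non-empty DNA string.

    Works in log space to avoid numerical underflow on long sequences.
    """
    # Initialisation: start probability times emission of the first symbol.
    score = [{s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}]
    back = []

    # Recursion: for each state, keep the best-scoring predecessor.
    for symbol in seq[1:]:
        prev = score[-1]
        column, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + math.log(TRANS[r][s]))
            column[s] = (
                prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][symbol])
            )
            pointers[s] = best
        score.append(column)
        back.append(pointers)

    # Traceback: start from the best final state and follow the pointers.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

Calling `viterbi("AAAAA" + "GCGCGCGC" + "AAAAA")` labels the GC-rich middle segment as coding and the flanks as noncoding under these toy parameters, because the emission gain over eight symbols outweighs the cost of two state switches.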
Monday, November 12th
No class, Veterans Day.
Monday, October 29th and November 5th
Talk on linkage analysis, Spring 2001: ps, pdf
Monday, October 22nd
Reading:
A. P. Dempster, N. M. Laird, and D. B. Rubin. (1977). Maximum
Likelihood from Incomplete Data via the EM
Algorithm. J. R. Statist. Soc. B. 39(1): 1-38. Download from JSTOR.
Notes on the EM algorithm (ps)
Monday, October 15th
Reading:
Y. Benjamini and Y. Hochberg. (1995). Controlling the false
discovery rate: a practical and powerful approach to multiple
testing. J. R. Statist. Soc. B. 57: 289-300. Download from JSTOR.
J. P. Shaffer. (1995). Multiple hypothesis
testing. Annu. Rev. Psychol. 46: 561-584.
Monday, October 8th
Reading:
L. R. Rabiner. (1989). A tutorial on hidden Markov models and selected
applications in speech recognition. Proceedings of the IEEE. 77 (2):
257-286.
H. M. Taylor and S. Karlin. (1984). An introduction to stochastic modeling. Academic Press.
Monday, October 1st
Lecture by Lior Pachter: more on Steiner trees and sequence alignment.
Monday, September 24th
Reading:
S. B. Needleman & C. D. Wunsch. (1970). A general method applicable
to the search for similarities in the amino acid sequences of two proteins.
Journal of Molecular Biology. 48: 443-453.
T. F. Smith & M. S. Waterman. (1981). Identification of common
molecular subsequences. Journal of Molecular Biology. 147: 195-197.
L. R. Rabiner. (1989). A tutorial on hidden Markov models and selected
applications in speech recognition. Proceedings of the IEEE. 77 (2):
257-286.
Monday, September 17th
Reading:
Robust local regression
W. S. Cleveland. (1979). Robust locally weighted regression and smoothing
scatterplots. Journal of the American Statistical Association. 74: 829-836.
Download from JSTOR.
Design of microarray experiments
M. K. Kerr & G. A. Churchill. (2001). Experimental Design for Gene
Expression Microarrays. Biostatistics. 2: 183-201.
Download from Gary Churchill's webpage or Biostatistics website.
Thursday, September 13th
Slides: Pre-processing in DNA microarray experiments (ppt)
Links:
Terry Speed's Microarray Data Analysis Group Page
Sandrine Dudoit's Homepage (more links ...)
Monday, September 10th
Slides: Introduction to the biology and technology of DNA microarrays (ppt)
Links:
Human Genome Project Education Resources
DNA Microarray Methodology Animation
The Chipping Forecast,
Nature Genetics, Vol. 21, supp. p. 1-60.
last updated November 27, 2001
sandrine@stat.berkeley.edu