My research and teaching activities concern the development and
application of statistical methods and software to address
problems in biomedical and genomic research.
Applications to biomedical and genomic research.
Design and analysis of high-throughput gene expression experiments
based on next-generation sequencing: mRNA-Seq for transcriptome
analysis and genome annotation.
Design and analysis of high-throughput gene expression
experiments based on DNA microarrays: mRNA-Chip for
transcriptome analysis and genome annotation;
alternative splicing microarrays; ChIP-Chip for DNA-protein interaction
profiling, e.g., transcription factor binding; metagenomics microarrays
(16S small-subunit rRNA microarrays) for the quantitative detection of
microorganisms in complex environmental and medical samples.
Nucleotide and protein sequence analysis: identification of
regulatory motifs in DNA sequences.
Genetic mapping of complex traits: IBD-based linkage
analysis; linkage disequilibrium analysis; SNP-based association
studies; microarray-based genetic mapping studies of gene expression.
Analysis of biological annotation metadata: e.g., Gene
Ontology (GO) annotation.
Statistical methodology.
Loss-based estimation with cross-validation: parametric and
non-parametric density estimation and regression, variable selection.
Multiple hypothesis testing: resampling-based multiple testing
procedures for controlling generalized Type I error rates, defined as
tail probabilities and expected values for arbitrary functions of the
numbers of Type I errors and rejected hypotheses (e.g., false discovery
rate).
Structured tests in high dimensions.
Statistical computing.
I am a core developer of the Bioconductor Project, an open-source and
open-development software project for the analysis of biomedical and
genomic data.