Adaptive Introgression
Until recently, there was no genetic evidence that modern humans and archaic humans had contact. We now know not only that there was admixture between the two hominin species, but also that this admixture may have been beneficial. We find that the gene underlying high-altitude adaptation in Tibetans harbors a Denisovan-like haplotype at high frequency. This and other findings suggest that genetic exchange between populations or between species might be more important than previously thought.

Adaptation to high altitude
Adaptation to high altitude has long been a clear example of positive selection in humans. I scan genomes using novel approaches to identify the genes that have likely contributed to people's ability to thrive at extreme altitudes. Convincing candidate genes may provide insights into many hypoxia-related diseases and inform us about gene function and evolution.

Characterizing Adaptive Events
I am interested in estimating the timing and strength of selection under various models of natural selection: (1) selection on a de novo mutation (the mutation is beneficial as soon as it enters the population), and (2) selection on standing variation (a mutation is initially neutral or weakly deleterious in the population but becomes beneficial after an environmental change).
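The two models can be contrasted with a minimal Wright-Fisher sketch. It is an illustration under stated assumptions, not the estimation method used in this research: selection is genic with a hypothetical coefficient `s`, drift is binomial sampling of `n_chrom` chromosomes, and the `onset` parameter (an assumption introduced here) marks the generation at which the environment changes and the allele becomes beneficial.

```python
import random


def next_frequency(p, s, n_chrom, rng):
    """Advance the allele frequency by one Wright-Fisher generation."""
    # Deterministic shift from genic selection (relative fitness 1 + s).
    p_sel = p * (1 + s) / (1 + p * s)
    # Genetic drift: binomial sampling of n_chrom chromosomes.
    count = sum(1 for _ in range(n_chrom) if rng.random() < p_sel)
    return count / n_chrom


def simulate(p0, s, n_chrom, generations, onset=0, seed=0):
    """Return the frequency trajectory; the allele is neutral before `onset`."""
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for g in range(generations):
        s_now = s if g >= onset else 0.0
        p = next_frequency(p, s_now, n_chrom, rng)
        trajectory.append(p)
    return trajectory


# (1) De novo mutation: a single copy, beneficial from the start.
de_novo = simulate(p0=1 / 1000, s=0.05, n_chrom=1000, generations=200)

# (2) Standing variation: already at 5%, neutral until generation 50.
standing = simulate(p0=0.05, s=0.05, n_chrom=1000, generations=200, onset=50)
```

Under these assumptions a single new copy is often lost to drift even when beneficial, whereas an allele that is already segregating when selection starts is less likely to be lost. That difference is part of what makes the two models distinguishable in data.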

Natural selection on the X chromosome
How much of human genetic variation is neutral, deleterious or beneficial remains an open question. I am interested in understanding how natural selection (both positive and negative), demographic histories and cultural practices have shaped genetic variation in the genome. I am particularly interested in how these forces have affected genetic variation on the X chromosome. Because it is present in one copy in males and two in females, the X chromosome may shed light on how these processes affect genetic variation: theory predicts different patterns for this chromosome than for the autosomes.
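One standard example of such a prediction concerns effective population size. The sketch below uses the textbook formulas for a population with `n_m` breeding males and `n_f` breeding females (both names are illustrative); it assumes random mating and no selection, and is not specific to the analyses described here.

```python
def ne_autosome(n_m, n_f):
    """Effective size for autosomes with n_m breeding males and n_f breeding females."""
    return 4 * n_m * n_f / (n_m + n_f)


def ne_x(n_m, n_f):
    """Effective size for the X chromosome under the same assumptions."""
    return 9 * n_m * n_f / (4 * n_m + 2 * n_f)


def x_to_autosome_ratio(n_m, n_f):
    """Expected ratio of X-linked to autosomal effective size."""
    return ne_x(n_m, n_f) / ne_autosome(n_m, n_f)


# Equal numbers of breeding males and females give the classic 3/4 ratio.
print(x_to_autosome_ratio(500, 500))  # 0.75

# Fewer breeding males than females pushes the ratio above 3/4.
print(x_to_autosome_ratio(100, 900))
```

Deviations of observed X-to-autosome diversity from 3/4 are one way that sex-biased demography or selection can leave a signal.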

Next Generation Sequencing (NGS)

With the advent of NGS, it is now possible to sequence multiple whole genomes. However, this brings new challenges: most of the data generated are low coverage (less than 10X per individual), and sequencing error rates are higher than those of more traditional technologies. These characteristics make calling genotypes challenging. Furthermore, allele frequencies estimated from called genotypes are very often inaccurate, which leads to biased estimates of population genetic parameters. I enjoy developing statistical techniques to estimate allele frequencies directly from NGS reads.
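To illustrate the idea of skipping genotype calls, here is a minimal sketch of one common approach: an expectation-maximization (EM) estimate of the allele frequency at a single biallelic site from per-individual read counts. It assumes Hardy-Weinberg proportions, independent reads, and a fixed, symmetric per-read error rate `err`. The function names and the simple error model are assumptions made for this example, not a description of any particular published method.

```python
def genotype_likelihoods(n_ref, n_alt, err=0.01):
    """Likelihood of the reads given 0, 1, or 2 copies of the alternate allele."""
    likes = []
    for g in (0, 1, 2):
        # Probability that a single read shows the alternate allele.
        p_alt = (g / 2) * (1 - err) + (1 - g / 2) * err
        likes.append((p_alt ** n_alt) * ((1 - p_alt) ** n_ref))
    return likes


def em_allele_frequency(read_counts, err=0.01, n_iter=100, tol=1e-8):
    """Estimate the alternate-allele frequency from (n_ref, n_alt) pairs."""
    gls = [genotype_likelihoods(r, a, err) for r, a in read_counts]
    f = 0.5  # starting value
    for _ in range(n_iter):
        expected_alt = 0.0
        for l0, l1, l2 in gls:
            # E-step: genotype posterior under a Hardy-Weinberg prior.
            w0 = l0 * (1 - f) ** 2
            w1 = l1 * 2 * f * (1 - f)
            w2 = l2 * f * f
            expected_alt += (w1 + 2 * w2) / (w0 + w1 + w2)
        # M-step: expected fraction of alternate alleles.
        new_f = expected_alt / (2 * len(gls))
        converged = abs(new_f - f) < tol
        f = new_f
        if converged:
            break
    return f


# Low-coverage example: (reference reads, alternate reads) per individual.
counts = [(3, 0), (2, 1), (0, 2), (4, 0), (1, 1), (0, 0)]
print(em_allele_frequency(counts))
```

An individual with no reads, such as the last one above, contributes only the prior. Uncertainty about each genotype is carried into the frequency estimate instead of being discarded by a hard call.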

The Beta Coalescent
An underlying assumption of Kingman's coalescent is that the number of offspring per individual is binomially distributed, with variance that remains finite as the population size tends to infinity. A more realistic model for some marine species with very large family sizes is the beta coalescent. We modified Hudson's backward-in-time simulator ms to allow for large family sizes and compared the results to theoretical predictions of the allele frequency spectrum.

Closely related populations
Many of the commonly used methods to estimate demographic parameters break down when two populations have recently diverged. I am currently working on characterizing which methods or measures of genetic variation perform better when populations are closely related. This is pertinent because it will help us better estimate the divergence time between the Tibetan and Han populations. A better estimate of this divergence time may help reconcile estimates based on genetic evidence with those based on archaeological evidence.