Projects and grants

Below is a list of current projects. Please also see the students’ pages for additional projects. Previous projects can also be found through the following links: censored data and causal inference, computational biology, data-adaptive learning, and multiple hypothesis testing.

Current Projects

  • Toxic Substances in the Environment: Quantitative Biology: Biostatistics, Bioinformatics, and Computation
    P.I. Martyn Smith, PhD

We provide investigators with consultative support in biostatistics, computational biology, and bioinformatics, and we support web-based dissemination of bioinformatics solutions and database access.

  • Targeted Learning: Causal Inference Methods for Implementation Science
    P.I. Mark van der Laan, PhD

This project will extend Targeted Learning methodology to address challenges that arise in HIV implementation science, including hierarchical data, complex dependencies between individuals and clusters, and small sample sizes. The resulting methods will be implemented as publicly available software and applied to clinical cohort data from Southern and Eastern Africa and to a large cluster randomized trial of antiretroviral-based HIV prevention in Kenya and Uganda.

  • PON Epigenetics and Neurodevelopment in Children
    P.I. Nina T. Holland, PhD

The goal of this grant is to highlight new findings strengthening our hypothesis that epigenetic mechanisms play a strong role in PON1 expression and health. Our approach will also serve as a model for future studies seeking to combine the effects of genetic and epigenetic factors on susceptibility genes related to environment, health, and disease.

  • Learning and Semi-parametric Causal Inference for Patient-centered Outcomes Research
    P.I. Alan E. Hubbard, PhD

The goal of this grant is to derive parameter estimates that can make use of messy, high-dimensional, and evolving data, and to more closely match the decision-making of an experienced practitioner by providing dynamic prediction (updating predictions at successive time points), calculating diagnostic indicators quickly, and evaluating the dynamic importance of variables.

  • A Rigorous System to Determine the Health Impacts of Policies and Programs
    P.I. Jennifer Ahern, PhD

This project is relevant to public health because it will increase knowledge of how policies and programs affect health and disparities. The system for determining health impacts will reduce the cost and improve the quality of future assessments of health effects. This project will bolster the capacity of the United States to improve and protect health by providing the evidence base to inform modification of current policies and design of future policies to protect health and reduce disparities.

  • Causal Inference for Effectiveness Research in Using Secondary Data
    The Patient Centered Outcomes Research Institute (PCORI), Brigham & Women’s Hospital
    P.I. Sebastian Schneeweiss, PhD

The goals of this grant are to identify and expand the library of predictive algorithms that will likely perform well in such data sources, provide solutions for any unexpected statistical issues arising during the project, support software development, and provide general technical insights into causal inference with secondary data sources.

  • Adaptive Strategies for Preventing & Treating Lapses of Retention in Care (AdaPT)
    NIH/NIMH (National Institute of Mental Health)
    P.I. Elvin Geng, PhD

The major objectives of this project are (1) to assess the comparative effectiveness of interventions to prevent lapses in retention, (2) to assess the effectiveness of interventions to re-engage patients with early lapses in retention, and (3) to assess the effectiveness and cost-effectiveness of sequential prevention and re-engagement strategies for retention.

  • Preterm Birth Initiative
    Bill and Melinda Gates Foundation
    P.I. Maya Petersen, PhD

Our project focuses on the relationships of exposures with the risk of preterm birth and lower birthweight among full-term babies.

  • Healthy Birth, Growth, and Development knowledge Integration (HBGDki)
    Bill and Melinda Gates Foundation
    P.I. Shasha Jumbe, PhD

The purpose of the HBGDki collaboration is to facilitate rapid progress, increase the understanding of the many complex and interrelated issues that contribute to poor growth outcomes, and improve our ability to promote healthy birth, growth, and development in the communities that need it most.

Current Grants

  • Targeted Learning: Causal Inference Methods for Implementation Science
    P.I. Mark van der Laan, PhD

This competitive renewal will develop general methods for evaluating the comparative effectiveness of alternative strategies for HIV prevention, treatment and care in Southern and Eastern Africa. Large cluster randomized trials and global cohort collaborations generate longitudinal data on hundreds of thousands of patients in real-world settings. These provide a tremendous resource for developing the “practice-based evidence” needed to maximize the impact of HIV prevention strategies and to improve healthcare delivery systems. Realizing this potential, however, demands innovations to the field of Targeted Learning for maximally unbiased and efficient estimation of statistical parameters, best approximating the causal effects of interest.

First, improved methods for estimating the effects of patient-responsive monitoring and treatment strategies must be developed. In the common settings of strong confounding and rare outcomes, current estimators suffer from bias, lack efficiency and have unreliable measures of uncertainty. Second, general causal models and identifiability assumptions must be developed for the joint effects of cluster and individual-level interventions over multiple time points. These models will account for interactions between individuals within clusters and potential contamination between clusters. This work will inform the optimal design for sampling clusters and measuring individuals within communities or clinics. Third, efficient and maximally unbiased estimators must be developed to evaluate the impact of cluster and individual-level interventions over multiple time points. Current methods are highly susceptible to bias and misleading inference due to model misspecification and due to the often incorrect assumption that the observed data represent n independent, identically distributed (i.i.d.) repetitions of an experiment. The developed methods will elucidate the pathways by which cluster-based interventions impact health, while remaining robust to the common challenges of sparsity, irregular and informative missingness, and few truly independent units (clusters) but potentially hundreds of thousands of conditionally independent units.

These innovations are motivated by our collaborations with the International epidemiologic Databases to Evaluate AIDS (IeDEA) in Southern (PI Dr. Egger) and Eastern Africa (PI Dr. Yiannoutsos) and the Sustainable East Africa Research in Community Health (SEARCH) consortium (PI Dr. Havlir), a cluster randomized trial to evaluate the community-wide benefits of ART initiation at all CD4 counts. The developed methods will be applied to these data sources to investigate (i) strategies for monitoring antiretroviral therapy (ART) and guiding switches to second-line regimens, (ii) the direct and indirect effects of a community-based HIV prevention strategy and (iii) the impact of clinic-based programs for delivering HIV care. Finally, the resulting estimators will be implemented as publicly available software packages and teaching papers written to explain the methodology in a clear and rigorous manner.