My research focuses on topics such as ensemble learning, learning from imbalanced binary-outcome data, influence-curve-based variance estimation, and statistical computing. I focus my efforts mainly at the intersection of machine learning and computing, where I work on scaling machine learning algorithms via software, hardware, and algorithmic approaches.

At Berkeley's Division of Biostatistics, my applied research focused on predicting virologic failure in AIDS patients using machine learning. Virologic failure can be predicted (and thus possibly prevented) using a combination of clinical variables and medication adherence data collected through a Medication Event Monitoring System (MEMS), or an electronic pillbox.

In Lawrence Berkeley National Lab's Computational Research Division, I have worked on detecting extratropical cyclones using massive climate data sets.


LeDell E, van der Laan MJ, Petersen M. AUC-Maximizing Ensembles through Metalearning. The International Journal of Biostatistics. 2016;12(1):203-218.

Arno Candel & Erin LeDell. Deep Learning with H2O (2015)

Maya L. Petersen, Erin LeDell, Joshua Schwab, Varada Sarovar, Robert Gross, Nancy Reynolds, Jessica E. Haberer, Kathy Goggin, Carol Golin, Julia Arnsten, Marc Rosen, Robert Remien, David Etoori, Ira Wilson, Jane M. Simoni, Judith A. Erlen, Mark J. van der Laan, Honghu Liu, David R Bangsberg. Super Learner Analysis of Electronic Adherence Data Improves Viral Prediction and May Provide Strategies for Selective HIV RNA Monitoring. Journal of Acquired Immune Deficiency Syndromes (JAIDS), 2015 May 1;69(1):109-18.

LeDell, Erin; Petersen, Maya; van der Laan, Mark. Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates. Electron. J. Statist. 9 (2015), no. 1, 1583–1607.

Erin LeDell, Prabhat, Dmitry Yu. Zubarev, Brian Austin and William A. Lester Jr. Classification of nodal pockets in many-electron wave functions via machine learning. Journal of Mathematical Chemistry, 50(7):2043-2050, 2012.


LeDell, Erin. Ensembles in H2O. H2O World, November 2015. Mountain View, CA.

LeDell, Erin. High Performance Machine Learning in R with H2O. ISM HPC on R Workshop, October 2015. Tokyo, Japan.

LeDell, Erin. h2oEnsemble: Scalable Ensemble Learning in R. useR! Conference, July 2015. Aalborg, Denmark.

LeDell, Erin. Intro to Practical Ensemble Learning. D-Lab @ UC Berkeley, April 2015.

LeDell, Erin. Ensembles in H2O. H2O World, November 2014. Mountain View, CA.

LeDell, Erin; Sapp, Stephanie; and van der Laan, Mark. subsemble: Ensemble learning in R with the Subsemble algorithm. useR! Conference, July 2014. Los Angeles, CA.

Petersen, M.; LeDell, E.; Martin, J.; Hunt, P.; Haberer, J.; Muzoora C.; van der Laan, M.; Bangsberg, D. Data Adaptive Super Learning to Predict Virologic Failure Based on Electronic Adherence Monitoring in Uganda. 18th International Workshop on HIV Observational Databases. March 2014, Sitges, Spain.

LeDell, Erin; Petersen, Maya L.; and van der Laan, Mark J. Computationally Efficient Confidence Intervals for Cross-validated AUC Estimates. Joint Statistical Meetings, August 2013. Montréal, Canada.

Dr. Maya Petersen, Ms. Varada Sarovar, Ms. Erin LeDell, Mr. Joshua Schwab, Dr. Robert Gross, Dr. Ira Wilson, Dr. Carol Golin, Dr. Nancy Reynolds, Dr. Robert Remien, Dr. Jane Simoni, Dr. Marc Rosen, Dr. Mark van der Laan, Dr. Honghu Liu, Dr. David Bangsberg. Data-Adaptive Super Learning to Predict Viral Rebound based on Electronic Adherence Monitoring: An Analysis of the MACH-14 Cohort Consortium. 7th International Conference on HIV Treatment and Prevention Adherence, June 2012. Miami, FL.

Michael F. Wehner, Prabhat, Surendra Byna, Fuyu Li, Erin LeDell, Thomas Yopes, Gunther Weber, Wes Bethel, William D. Collins. TECA, 13TB, 80,000 processors -- Or: Characterizing extreme weather in a changing climate. Second Workshop on Understanding Climate Change from Data, August 2012. University of Minnesota.