Currently I am the Chief Machine Learning Scientist at the company behind the open source machine learning software H2O.

I received my Ph.D. in Biostatistics with a Designated Emphasis in Computational Science and Engineering from UC Berkeley. I have a B.S. and M.A. in Mathematics and have worked for many years in industry as a software developer and data scientist. My research focuses on ensemble machine learning, learning from imbalanced binary-outcome data, influence-curve-based variance estimation, and statistical computing.

My dissertation, titled "Scalable Ensemble Learning and Computationally Efficient Variance Estimation", was awarded the 2015 Erich L. Lehmann Citation by the UC Berkeley Department of Statistics.

At Berkeley, I was co-advised by Mark J. van der Laan and Maya L. Petersen.

I also co-created an online course called Machine Learning with Tree-Based Models in R on DataCamp, an interactive platform to learn R and data science.

You can reach me by email, or find me on GitHub or Twitter.
