Jacob Steinhardt (jsteinhardt@berkeley)

My goal is to make the conceptual advances necessary for machine learning systems to be reliable and aligned with human values. This includes the following directions:
  • Robustness: How can we build models robust to distributional shift, to adversaries, to model mis-specification, and to approximations imposed by computational constraints? What is the right way to evaluate such models?
  • Reward specification and reward hacking: Human values are too complex to be specified by hand. How can we infer complex value functions from data? How should an agent make decisions when its value function is approximate, due to noise in the data or inadequacies in the model? How can we prevent reward hacking, that is, degenerate policies that exploit differences between the inferred and true reward?
  • Scalable alignment: Modern ML systems are often too large, and deployed too broadly, for any single person to reason about in detail, posing challenges to both design and monitoring. How can we design ML systems that conform to interpretable abstractions? How do we enable meaningful human oversight at training and deployment time despite the large scale? How will these large-scale systems affect societal equilibria?
These challenges require rethinking both the theoretical and empirical paradigms of ML. Theories of statistical generalization do not account for the extreme types of generalization considered above, and decision theory does not account for cases where the reward function is only approximate. Meanwhile, measuring empirical test accuracy on a fixed distribution is insufficient to analyze phenomena such as robustness to distributional shift.

I seek students who are technically strong, broad-minded, and want to improve the world through their research. I particularly value creative, curious thinkers who are excited to revisit the conceptual foundations of the field.

Outside of research, I am a coach for the USA Computing Olympiad and an instructor at the Summer Program in Applied Rationality and Cognition. I also consult part-time for the Open Philanthropy Project. I like indoor bouldering and ultimate frisbee.


I will be joining the Statistics faculty at UC Berkeley in Fall 2019, where I will also be a member of the Berkeley Artificial Intelligence Lab and of the EECS department (by courtesy). I recently finished a PhD in machine learning at Stanford University, working with Percy Liang. Over the next year I will spend some time working at the Open Philanthropy Project and at OpenAI.


I maintain two blogs: an expository blog and a daily research log (the latter somewhat out of date).


Blog Posts

Research as a Stochastic Decision Process (December 2018) [link]
Long-Term and Short-Term Challenges to Ensuring the Safety of AI Systems (June 2015) [link]
The Power of Noise (June 2014) [link]
A Fervent Defense of Frequentist Statistics (February 2014) [link]
Beyond Bayesians and Frequentists (October 2012) [link]

PhD Thesis

Jacob Steinhardt
Robust Learning: Information Theory and Algorithms


Publications

(asterisk indicates joint or alphabetical authorship)

Kensen Shi, Jacob Steinhardt, and Percy Liang
FrAngel: Component-Based Synthesis with Control Structures
POPL 2019

Pang Wei Koh*, Jacob Steinhardt*, and Percy Liang
Stronger Data Poisoning Attacks Bypass Data Sanitization Defenses

Aditi Raghunathan, Jacob Steinhardt, and Percy Liang
Semidefinite Relaxations for Certifying Robustness to Adversarial Examples
NIPS 2018

Zachary C. Lipton* and Jacob Steinhardt*
Troubling Trends in Machine Learning Scholarship
[Paper] [Blog post (for comments)]

Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Jacob Steinhardt*, and Alistair Stewart
Sever: A Robust Meta-Algorithm for Stochastic Optimization

Miles Brundage, Shahar Avin, Jack Clark, et al.
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation

Aditi Raghunathan, Jacob Steinhardt, and Percy Liang
Certified Defenses against Adversarial Examples
[Paper] [Open Reviews]
ICLR 2018

Pravesh Kothari* and Jacob Steinhardt*
Better Agnostic Clustering via Relaxed Tensor Norms
STOC 2018 (merged with "Outlier-Robust Moment Estimation via Sum-of-Squares")

Jacob Steinhardt*, Pang Wei Koh*, and Percy Liang
Certified Defenses for Data Poisoning Attacks
NIPS 2017
[Paper] [Poster] [Code (git)] [Experiments (codalab)]

Jacob Steinhardt
Does Robustness Imply Tractability? A Lower Bound for Planted Clique in the Semi-Random Model

Jacob Steinhardt, Moses Charikar, and Gregory Valiant
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers
ITCS 2018
[Paper] [Slides]

Moses Charikar*, Jacob Steinhardt*, and Gregory Valiant*
Learning from Untrusted Data
STOC 2017
[Paper] [Slides] [Poster]

Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané
Concrete Problems in AI Safety

Jacob Steinhardt, Gregory Valiant, and Moses Charikar
Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction
NIPS 2016

Jacob Steinhardt and Percy Liang
Unsupervised Risk Estimation Using Only Conditional Independence Structure
NIPS 2016
[Paper] [Older preprint]

Jacob Steinhardt*, Gregory Valiant*, and Stefan Wager*
Memory, Communication, and Statistical Queries
COLT 2016
[Paper] [ECCC preprint]

Jacob Steinhardt and Percy Liang
Learning with Relaxed Supervision
NIPS 2015
[Paper] [Code] [Poster]

Jacob Steinhardt and Percy Liang
Reified Context Models
ICML 2015
[Paper] [Code] [Slides] [Poster]

Jacob Steinhardt and Percy Liang
Learning Fast-Mixing Models for Structured Prediction
ICML 2015
[Paper] [Code] [Slides] [Talk] [Poster]

Jacob Steinhardt and John Duchi
Minimax Rates for Memory-Constrained Sparse Linear Regression
COLT 2015
[Paper] [Slides] [Talk] [Poster]

Tianlin Shi, Jacob Steinhardt, and Percy Liang
Learning Where to Sample in Structured Prediction
[Paper] [Code: GitHub/CodaLab] [Slides]

Jacob Steinhardt*, Stefan Wager*, and Percy Liang
The Statistics of Streaming Sparse Regression
arXiv preprint

Jacob Steinhardt and Percy Liang
Adaptivity and Optimism: An Improved Exponentiated Gradient Algorithm
ICML 2014
[Paper] [Slides] [Poster]

Jacob Steinhardt and Percy Liang
Filtering with Abstract Particles
ICML 2014
[Paper] [Slides] [Poster]

Jacob Steinhardt and Zoubin Ghahramani
Flexible Martingale Priors for Deep Hierarchies
[Paper] [Slides] [Poster]

Jacob Steinhardt and Zoubin Ghahramani
Pathological Properties of Deep Bayesian Hierarchies
2011 NIPS Workshop on Bayesian Nonparametrics
[Poster Abstract] [Poster]

Jacob Steinhardt and Russ Tedrake
Finite-Time Regional Verification of Stochastic Nonlinear Systems
Robotics: Science and Systems, 2011
Best Student Paper Finalist
[Conference Paper and Errata] [Journal Paper] [Slides] [Poster]

Jacob Steinhardt
Permutations with Ascending and Descending Blocks
Electronic Journal of Combinatorics, 17:R14
[Paper] [Slides]

Jacob Steinhardt
On Coloring the Odd-Distance Graph
Electronic Journal of Combinatorics, 16:N12

Jacob Steinhardt
Cayley Graphs Formed by Conjugate Generating Sets of S_n
3rd Place in 2007 Siemens Competition

Longer Talks

Learning with Memory and Communication Constraints
Learning with Intractable Inference and Partial Supervision

Past/Present Collaborators

Pang Wei Koh
Moses Charikar
Gregory Valiant
John Duchi
Tianlin Shi
Stefan Wager
Percy Liang
Zoubin Ghahramani
Russ Tedrake