__2022__

* I am giving a talk at the University of Alberta Statistics Department Seminar on October 26th.

* I am giving a talk at the EPFL Fundamentals of Learning and Artificial Intelligence Seminar on September 30th.

* I am a visiting scientist at EPFL in September and October, hosted by Emmanuel Abbe.

* I am giving a talk at the Joint Statistical Meetings about benign overfitting without linearity.

* Benign overfitting without linearity was accepted at COLT 2022.

* I am an organizer for the Deep Learning Theory Summer School and Workshop, to be held this summer at the Simons Institute.

* I will be speaking at the ETH Zurich Data, Algorithms, Combinatorics, and Optimization Seminar on June 7th.

* I will be a keynote speaker at the University of Toronto Statistics Research Day on May 25th.

* I am giving a talk at Harvard University's Probabilitas Seminar on May 6th.

* Two recent works accepted at the Theory of Overparameterized Machine Learning 2022 workshop, including one as a contributed talk.

* I am giving a talk at the Microsoft Research ML Foundations Seminar on April 28th.

* I am giving a talk at the University of British Columbia (Christos Thrampoulidis's group) on April 8th.

* I am giving a talk at Columbia University (Daniel Hsu's group) on April 4th.

* I am giving a talk at Oxford University (Yee Whye Teh's group) on March 23rd.

* I am giving a talk at the NSF/Simons Mathematics of Deep Learning seminar on March 10th.

* I am giving a talk at the Google Algorithms Seminar on March 8th.

* I'm reviewing for the Theory of Overparameterized Machine Learning 2022 workshop.

* Two new preprints with Niladri Chatterji and Peter Bartlett: Benign Overfitting without Linearity and Random Feature Amplification.

* Recent work on sample complexity of a self-training algorithm accepted at AISTATS 2022.

## Older news

__2021__

* I am speaking at the Deep Learning Theory Symposium at the Simons Institute on December 6th.

* My paper on proxy convexity as a framework for neural network optimization was accepted at NeurIPS 2021.

* Two new preprints on arXiv: (1) Proxy convexity: a unified framework for the analysis of neural networks trained by gradient descent, and (2) Self training converts weak learners to strong learners in mixture models.

* I am reviewing for the ICML 2021 workshop Overparameterization: Pitfalls and Opportunities (ICMLOPPO2021).

* Three recent papers accepted at ICML, including one as a long talk.

* New preprint on provable robustness of adversarial training for learning halfspaces with noise.

* I will be presenting recent work at TOPML2021 as a lightning talk, and at the SoCal ML Symposium as a spotlight talk.

* I'm giving a talk at the ETH Zurich Young Data Science Researcher Seminar on April 16th.

* I'm giving a talk at the Johns Hopkins University Machine Learning Seminar on April 2nd.

* I'm reviewing for the Theory of Overparameterized Machine Learning Workshop.

* I'm giving a talk at the Max-Planck-Institute (MPI) MiS Machine Learning Seminar on March 11th.

* New preprint showing SGD-trained neural networks of any width generalize in the presence of adversarial label noise.

__2020__

* New preprint on agnostic learning of halfspaces using gradient descent is now on arXiv.

* My single neuron paper was accepted at NeurIPS 2020.

* I will be attending the IDEAL Special Quarter on the Theory of Deep Learning hosted by TTIC/Northwestern for the fall quarter.

* I've been awarded a Dissertation Year Fellowship by UCLA's Graduate Division.

* New preprint on agnostic PAC learning of a single neuron using gradient descent is now on arXiv.

* New paper accepted at *Brain Structure and Function* from work with researchers at UCLA School of Medicine.

* I'll be (remotely) working at Amazon's Alexa AI group for the summer as a research intern, working on natural language understanding.

__2019__

* My paper with Yuan Cao and Quanquan Gu, "Algorithm-dependent Generalization Bounds for Overparameterized Deep Residual Networks", was accepted at NeurIPS 2019 (arXiv version, NeurIPS version).

## Paper summary

Benign overfitting, where a statistical model perfectly fits noisy training data yet still generalizes well, was first observed in neural networks trained by gradient descent. Reconciling this behavior with the long-standing intuition from statistics that overfitting is a hazard to be avoided has become a central task for statisticians and learning theorists.

In this work, we provide a characterization of benign overfitting in two-layer neural networks trained by gradient descent following random initialization. We prove that even when a constant fraction of training data have random labels, neural networks trained by gradient descent can achieve 100% training accuracy and simultaneously generalize near-optimally. In contrast to previous works that require either linear or kernel-based predictors, we characterize a benign overfitting phenomenon in a setting where both the model and learning dynamics are fundamentally nonlinear.