Statistical Mechanics Methods for Discovering Knowledge from Production-Scale Neural Networks
(A Tutorial at KDD 2019)
August 4th, 2019.
1:00PM–5:00PM, Summit 8, Ground Level, Egan
To Prepare for the Tutorial: Part of the tutorial will cover the weightwatcher tool, which we have developed and which can be used to reproduce and extend our results. In preparation for the tutorial, you should "pip install weightwatcher" and/or go to our WeightWatcher repo for more information, demo notebooks, etc.
Link to the slides: click here.

Charles Martin
Charles Martin
holds a PhD in Theoretical Chemistry from the University of Chicago.
He was then an NSF Postdoctoral Fellow and worked in a Theoretical Physics group at UIUC that studied the statistical mechanics of Neural Networks.
He currently owns and operates
Calculation Consulting,
a boutique consultancy specializing in ML and AI, supporting clients doing applied research in AI.
He maintains a well-recognized blog on practical ML theory, and he has to date supported and performed the work on Implicit and Heavy-Tailed Self-Regularization in Deep Learning.


Michael Mahoney
Michael Mahoney is at ICSI and Department of Statistics at UC Berkeley.
He works on algorithmic and statistical aspects of modern large-scale data analysis.
He is a leader in Randomized Numerical Linear Algebra; he led the largest empirical evaluation to date of community structure in social and information networks; he has developed implicit regularization methods and scalable optimization methods for convex and non-convex problems; and he has applied these methods and complementary RMT methods to DNN problems.

The tutorial will review recent developments in using techniques from statistical mechanics to understand the properties of modern deep neural networks.
Although there have long been connections between statistical mechanics and neural networks, in recent decades these connections have withered.
In light of the recent failure of traditional statistical learning theory and stochastic optimization theory to describe, even qualitatively, many properties of production-quality deep neural network models, researchers have revisited ideas from the statistical mechanics of neural networks.
The tutorial will provide an overview of the area; it will go into detail on how connections with heavy-tailed random matrix theory can lead to a practical phenomenological theory for large-scale deep neural networks; and it will describe future directions.
More details can be found here.
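The core diagnostic behind this phenomenology can be sketched in a few lines: form the correlation matrix of a layer's weights, compute its empirical spectral density (ESD), and fit a power-law exponent alpha to the tail. The sketch below is illustrative only, using a synthetic heavy-tailed matrix and a simple Hill-style MLE with an arbitrary cutoff; it is not the tutorial's actual procedure, which the weightwatcher tool automates for real trained models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a trained layer's weight matrix, with heavy-tailed
# (Pareto) entries; a real analysis would extract W from a trained DNN.
W = rng.pareto(2.5, size=(300, 100))

# Empirical spectral density: eigenvalues of the correlation matrix X = W^T W / N.
N, M = W.shape
X = W.T @ W / N
eigs = np.linalg.eigvalsh(X)

# Hill-style MLE of the power-law tail exponent alpha, rho(lambda) ~ lambda^{-alpha},
# using eigenvalues above a cutoff xmin (the median here, purely for illustration).
xmin = np.quantile(eigs, 0.5)
tail = eigs[eigs >= xmin]
alpha = 1.0 + len(tail) / np.sum(np.log(tail / xmin))
print(f"estimated tail exponent alpha = {alpha:.2f}")
```

Smaller fitted alpha indicates heavier tails in the ESD, which is the quantity the papers below correlate with implicit self-regularization and test accuracy trends.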
Several relevant papers, including code to reproduce the results:
Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks,
C. H. Martin and M. W. Mahoney,
Technical Report, Preprint: arXiv:1901.08278 (2019)
(arXiv),
(code),
Traditional and Heavy-Tailed Self Regularization in Neural Network Models,
C. H. Martin and M. W. Mahoney,
Technical Report, Preprint: arXiv:1901.08276 (2019)
(arXiv),
(code),
Accepted for publication, Proc. ICML 2019.
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning,
C. H. Martin and M. W. Mahoney,
Technical Report, Preprint: arXiv:1810.01075 (2018)
(arXiv),
(code),
Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior,
C. H. Martin and M. W. Mahoney,
Technical Report, Preprint: arXiv:1710.09553 (2017)
(arXiv),
Several related presentations:
Charles talking
at LBNL, June 2018.
Michael talking
at the Simons Institute 2018 Big Data RandNLA meeting, Sept 2018.
Charles talking
at ICSI, December 2018.
Michael talking
at SF Bay Area DMSIG Meeting, February 2019.
