Michael Mahoney - Presentations

Talks and Presentations

Several recent presentations:

  • Why Deep Learning Works: Implicit Self-Regularization in Deep Neural Networks (Sept 2018, at Simons Institute 2018 Big Data RandNLA meeting) (pdf)

  • Alchemist: An Apache Spark <=> MPI Interface (June 2018) (pdf)

  • Biomedicine & the Foundations of Data? (May 2018) (pdf)

  • Scientific Machine Learning with Alchemist (An Apache Spark <=> MPI Interface) and Beyond (Apr 2018) (pdf)

  • Numerically-intensive Machine Learning at Scale (Fall 2017) (pdf)

  • Second-order Machine Learning (Fall 2017) (pdf)

Several tutorial presentations:

  • Sampling for Linear Algebra, Statistics, and Optimization (Aug 2018, at Simons Institute 2018 Big Data Bootcamp) (pdf)

  • RandNLA: Randomization in Numerical Linear Algebra (5.5 hr version at UW Madison Summer School, July 2018) (pdf)

  • Randomization in Numerical Linear Algebra: Theory and Practice (2.0 hr version at SIAM ALA Meeting, October 2015) (pdf)

  • Past, Present and Future of Randomized Numerical Linear Algebra (3.0 hr version at Simons Institute 2013 Big Data Bootcamp) (Part I: pdf, ppt and Part II: pdf, ppt)

  • Theory (and some practice) of Randomized Algorithms for Matrices and Data (tutorial from FOCS 2012 Workshop) (pdf, ppt)

  • Geometric Tools for Identifying Structure in Large Social and Information Networks (1.5 hr version at SAMSI Opening Workshop 2010, etc.) (pdf, ppt)

  • Geometric Tools for Identifying Structure in Large Social and Information Networks (2 hr version at ICASSP 2011, etc.) (pdf, ppt)

  • Geometric Tools for Identifying Structure in Large Social and Information Networks (3 hr version at ICML 2010 and KDD 2010, etc.) (pdf, ppt) (The pdf file in four pieces: here, here, here, and here.)

  • Randomized Algorithms for Matrices and Massive Data Sets (at SIAM-SDM06 2006 and VLDB 2006) (ppt)

  • Randomized Algorithms for Matrices and Massive Data Sets (at ACM-SIGKDD 2005) (ppt)

Older presentations:

  • UC Berkeley's FODA Institute: Foundations of Data Analysis (NSF TRIPODS Kickoff, Oct 2017) (pdf)

  • Second-order Machine Learning (Pre-fall 2017) (pdf)

  • Local graph analytics: beyond characterizing community structure (Spring 2017) (pdf)

  • Terabyte-scale Computational Statistics (talk Fall 2016) (pdf)

  • Scientific Matrix Factorizations in Spark at Scale: Cross-platform performance, scaling, and comparisons with C+MPI (talk at 2016 Dato Data Science Summit and elsewhere) (pdf)

  • Optimization Algorithms for Analyzing Large Datasets (talk at 2016 PCMI Summer School) (pdf)

  • Foundations of Data Science (talk at NSF pre-TRIPODS workshop/meeting, Apr 2016) (pdf)

  • Sub-Sampled Newton Methods (talk at ITA 2016 and elsewhere) (pdf)

  • Challenges in Multiresolution Methods for Graph-based Learning (talk, 3 of 3, from NIPS15 Workshops) (pdf)

  • Using Local Spectral Methods in Theory and in Practice (talk, 2 of 3, from NIPS15 Workshops) (pdf)

  • Column Subset Selection on Terabyte-sized Scientific Data (talk, 1 of 3, from NIPS15 Workshops) (pdf)

  • Linear and Sublinear Linear Algebra Algorithms: Preconditioning Stochastic Gradient Algorithms with Randomized Linear Algebra (DIMACS, Aug 2015) (pdf)

  • Overview of RandNLA: Randomized Numerical Linear Algebra (pdf)

  • Tree-like structure in social graphs (pdf (big=28MB), ppt (big=48MB))

  • Eigenvector localization, implicit regularization, and algorithmic anti-differentiation for large-scale graphs and network data (pdf, ppt)

  • Locally-biased and semi-supervised eigenvectors (talk from MMDS 2014) (pdf, ppt)

  • Implicit regularization in sublinear approximation algorithms (pdf, ppt)

  • BIG Biomedicine and the Foundations of BIG Data Analysis (at Big Data in Biomedicine at Stanford's Medical School, 5/23/14) (pdf, ppt)

  • Revisiting the Nyström Method for Improved Large-Scale Machine Learning (pdf, ppt)

  • Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments (version from Simons Big Data Workshop II) (pdf)

  • Input-sparsity Time Algorithms for Embeddings and Regression Problems (talk from Simons Big Data Workshop I) (pdf)

  • Randomized Regression in Parallel and Distributed Environments (talk from GraphLab 2013) (pdf)

  • Extracting insight from large networks: implications of small-scale and large-scale structure (pdf, ppt)

  • Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments (version from MMDS 2012) (pdf)

  • Sensors, networks, and massive data (pdf, ppt)

  • Randomized Algorithms for Matrices and Data (pdf, ppt)

  • Approximate computation and implicit regularization in large-scale data analysis (PODS version) (pdf, ppt)

  • Approximate computation and implicit regularization in large-scale data analysis (Stats version 1) (pdf, ppt)

  • Approximate computation and implicit regularization in large-scale data analysis (Short version) (pdf, ppt)

  • Looking for clusters in your data ... in theory and in practice (pdf, ppt)

  • Fast Approximation of Matrix Coherence and Statistical Leverage (pdf, ppt)

  • Implementing regularization implicitly via approximate eigenvector computation (pdf, ppt)

  • Linear Algebra and Machine Learning of Large Informatics Graphs (pdf, ppt)

  • Geometric Network Analysis Tools (talk from MMDS 2010) (pdf, ppt)

  • Algorithmic and Statistical Perspectives on Large-Scale Data Analysis (pdf, ppt)

  • Community structure in large social and information networks (newer) (pdf, ppt)

  • Statistical leverage and improved matrix algorithms (newer and long) (pdf, ppt)

  • Approximation Algorithms as Experimental Probes of Informatics Graphs (pdf, ppt)

  • Community structure in large social and information networks (talk from MMDS 2008) (pdf, ppt)

  • Community structure in large social and information networks (older) (pdf, ppt)

  • Statistical leverage and improved matrix algorithms (older and short) (pdf, ppt)

  • Sampling algorithms and core-sets for Lp regression and applications (pdf, ppt)

  • CUR Matrix Decompositions for Improved Data Analysis (talk from MMDS 2006) (pdf, ppt)

  • A Relative-Error CUR Decomposition for Matrices and Its Data Applications (pdf, ppt)

  • Sampling Algorithms for L2 Regression and Applications (talk from SODA 2006) (pdf, ppt)

  • Approximating a Gram Matrix for Improved Kernel-Based Learning (talk from COLT 2005) (ps, pdf)

  • Fast Monte Carlo Algorithms for Matrix Operations and Massive Data Set Analysis (newer) (pdf, ppt)

  • Fast Monte Carlo Algorithms for Matrix Operations and Massive Data Set Analysis (older) (pdf)

  • CUR Matrix Decomposition with Applications to Algorithm Design and Massive Data Set Analysis (pdf)

  • Fast Monte Carlo Algorithms for Massive Data Sets and Approximating Max-Cut (ps, pdf)

The TIP5P Water talk:

  • The Computational Statistical Mechanics of Simple Models of Liquid Water (pdf)

Videos of Talks and Presentations