Michael Mahoney - Research

Overview

My work focuses on the algorithmic and statistical aspects of modern large-scale data analysis. Much of it addresses foundational and theoretical questions, but the theory is closely tied to implementation questions and to a diverse range of practical applications.

Theory:

  • Randomized linear algebra and randomized numerical linear algebra.

  • Stochastic optimization for convex and non-convex problems.

  • Implicit regularization for scalable approximation/optimization algorithms.

  • Local graph partitioning and approximation algorithms.

Implementations:

  • of a range of core linear algebra, graph, and stochastic optimization algorithms

  • on single machine, distributed data system, and supercomputer environments

Applications:

  • Internet and social media analysis.

  • Community structure, clustering, and information dynamics in large social and information networks.

  • Genetics, medical imaging, astronomy/astrophysics, climate science, and a range of other scientific applications.

My dissertation was in computational statistical mechanics; its centerpiece was the development and analysis of the TIP5P model of liquid water. Prior to graduate school, I worked in both computational and experimental biophysics on proteins and protein-nucleic acid interactions. After graduate school, I switched to theoretical computer science, where I worked extensively on randomized algorithms for large matrix and graph problems. I also spent several years at Yahoo Research, working on large-scale web analytics, query log analysis, social media analysis, and social network analysis.

Software

See the full publication list for code to reproduce the results of individual papers.

Funding

Many thanks to those currently providing funding.

  • NSF Research Grant (with P. Drineas, M. Gu, and I. Ipsen), "Randomization as a Resource for Rapid Prototyping," 2018-2021, $450K.

  • NSF Research Grant, "Combining Stochastics and Numerics for Improved Scalable Matrix Computations," 2018-2021, $500K.

  • ONR Research Grant (with A. Shrivastava and R. Baraniuk), "Randomized Numerical Linear Algebra for Large-scale Learning and Inference," 2018-2022, $400K.

  • NSF Research Grant (with B. Yu, F. Perez, R. Karp, and M. Jordan), "Berkeley Institute on Foundations of Data Analysis," 2017-2020, $1.5M.

  • NSF Research Grant (with K. Ramchandran and S. Avestimehr), "Foundations of Coding for Modern Distributed Computing," 2017-2021, $350K.

  • DOE Research Grant, "Scalable Inference for Adversarial Network Data," 2016-2018, $90K.

  • DARPA Research Grant, D3M program, "Robust, Efficient, and Local Machine Learning Primitives," 2017-2021, $1.35M.

  • Academic Research Gift: Adobe, Inc., "Terabyte-scale Regression Diagnostic Methods for Interactive and Exploratory Analytics," 2016-2018, $50K.

  • ARO Research Grant, "Local Algorithms for Large Informatics Graphs," 2016-2019, $375K.

  • Cray Research Grant, "Implementing and Evaluating Matrix Algorithms in Spark on High Performance Computing Platforms for Science Applications," 2015-2019, $1.0M.

  • UCB Internal Research Grant, via BDD, "Improving the scaling of deep learning networks by characterizing and exploiting soft convexity," 2016-2019, $200K.

  • NSF Research Grant, via Purdue CSoI (with D. Gleich), "Quantifying the information content of a graph via information in graph diffusions," 2015-2019, $325K.

  • NSF Travel Grant (with J. Demmel, O. Schwartz, and S. Toledo), "Streaming Algorithms for Fundamental Computations in Numerical Linear Algebra," 2015-2019, $40K.