Michael Mahoney - Research

Overview

The main focus of my work is on algorithmic and statistical aspects of modern large-scale data analysis. Much of it addresses foundational and theoretical questions, but the theory is strongly tethered to implementation questions and a diverse range of very practical applications.

Theory:

  • Randomized linear algebra and randomized numerical linear algebra.

  • Stochastic optimization for convex and non-convex problems.

  • Local graph partitioning and approximation algorithms.

Implementations:

  • of a range of core linear algebra, graph, and stochastic optimization algorithms

  • on single machine, distributed data system, and supercomputer environments

  • including the "RandBLAS" and "RandLAPACK" libraries, as described in The RandLAPACK book.

Applications:

  • Genetics, medical imaging, astronomy/astrophysics, climate science, fluid flow, and other scientific applications.

  • Internet and social media analysis.

  • Computer vision and natural language processing.

My dissertation was in computational statistical mechanics; its centerpiece was the development and analysis of the TIP5P model of liquid water. Prior to graduate school, I worked in both computational and experimental biophysics on proteins and protein-nucleic acid interactions. After graduate school, I switched to theoretical computer science, where I worked extensively on randomized algorithms for large matrix and graph problems. I also spent several years at Yahoo Research, working on large-scale web analytics, query log analysis, social media analysis, and social network analysis.

Funding

Many thanks to the following funders. (This list is a little out of date.)

  • NSF Research Grant (with P. Drineas, M. Gu, and I. Ipsen), "Randomization as a Resource for Rapid Prototyping," 2018-2021, $450K.

  • NSF Research Grant, "Combining Stochastics and Numerics for Improved Scalable Matrix Computations," 2018-2021, $500K.

  • ONR Research Grant (with A. Shrivastava and R. Baraniuk), "Randomized Numerical Linear Algebra for Large-scale Learning and Inference," 2018-2022, $400K.

  • NSF Research Grant (with B. Yu, F. Perez, R. Karp, and M. Jordan), "Berkeley Institute on Foundations of Data Analysis," 2017-2020, $1.5M.

  • NSF Research Grant (with K. Ramchandran and S. Avestimehr), "Foundations of Coding for Modern Distributed Computing," 2017-2021, $350K.

  • DOE Research Grant, "Scalable Inference for Adversarial Network Data," 2016-2018, $90K.

  • DARPA Research Grant, D3M program, "Robust, Efficient, and Local Machine Learning Primitives," 2017-2021, $1.35M.

  • Academic Research Gift: Adobe, Inc., "Terabyte-scale Regression Diagnostic Methods for Interactive and Exploratory Analytics," 2016-2018, $50K.

  • ARO Research Grant, "Local Algorithms for Large Informatics Graphs," 2016-2019, $375K.

  • Cray Research Grant, "Implementing and Evaluating Matrix Algorithms in Spark on High Performance Computing Platforms for Science Applications," 2015-2019, $1.0M.

  • UCB Internal Research Grant, via BDD, "Improving the scaling of deep learning networks by characterizing and exploiting soft convexity," 2016-2019, $200K.

  • NSF Research Grant, via Purdue CSoI (with D. Gleich), "Quantifying the information content of a graph via information in graph diffusions," 2015-2019, $325K.

  • NSF Travel Grant (with J. Demmel, O. Schwartz, and S. Toledo), "Streaming Algorithms for Fundamental Computations in Numerical Linear Algebra," 2015-2019, $40K.