Stat260:
Randomized Linear Algebra, Optimization, and Large-Scale Learning


Instructor: Michael Mahoney
  • Email: mmahoney ATSYMBOL stat.berkeley.edu
  • Office hours: TBD.
  • Office is on the third floor of Evans Hall.

    Class time and Location:
  • Tue-Thu 3:30-5:00pm, in 340 Evans Hall, on the UC Berkeley campus. (First meeting is Tue Jan 21, 2025.)


    Notes:
  • (1/9) Here is the initial web page for the class. All students, including auditors, are requested to register for the class. Auditors should register S/U; an S grade will be awarded for class participation and satisfactory scribe notes.


    Course description: Matrices are a popular way to model data (e.g., term-document data, people-SNP data, social network data, machine learning kernels, neural network weight matrices, and so on). However, the size-scale, noise properties, and diversity of modern data present serious challenges for many traditional deterministic matrix algorithms. In recent years, Randomized Numerical Linear Algebra (RandNLA) has emerged as an area that exploits randomness to design improved matrix algorithms. The course will cover the theory and practice of RandNLA-based algorithms for large-scale matrix problems arising in modern machine learning, data science, and scientific computing. Topics to be covered include: underlying theory, including the Johnson-Lindenstrauss lemma, random sampling and projection algorithms, and connections between representative problems such as matrix multiplication, least-squares regression and linear solvers, low-rank matrix approximation, and eigenvector computation; uses in the sketch-and-solve, sketch-and-precondition, and iterative sketching paradigms; uses in scalable first-order and second-order stochastic optimization algorithms; numerical and computational issues that arise in practice when implementing algorithms in different computational environments and at different levels of precision; machine learning and statistical issues; and extensions/connections to related problems, as well as recent work that builds on the basic methods. The course is appropriate for graduate students in computer science, statistics, and mathematics, as well as computationally inclined students from application domains.
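    To give a flavor of the sketch-and-solve paradigm mentioned above, here is a toy illustration in NumPy: an overdetermined least-squares problem is compressed with a Gaussian random sketch and solved in the smaller sketched space. All sizes and parameter choices here are illustrative assumptions, not taken from the course materials.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall least-squares problem: min_x ||A x - b||_2, with n >> d.
n, d = 10_000, 20
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Sketch-and-solve: compress the rows with a Gaussian sketch S (m x n),
# then solve the much smaller m x d problem min_x ||S A x - S b||_2.
m = 200  # illustrative sketch size; theory suggests m growing like d/eps^2
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# Compare against the exact least-squares solution.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
rel_err = np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact)
print(f"relative error of sketched solution: {rel_err:.4f}")
```

    The sketched problem has only m = 200 rows instead of n = 10,000, yet (by JL-style subspace-embedding arguments covered in the course) its solution is close to the exact one with high probability.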

    Prerequisites: General mathematical sophistication, and a solid understanding of Algorithms, Linear Algebra, and Probability Theory, at the advanced undergraduate or beginning graduate level, or equivalent. (If you have not at least seen most of the material in Chapters 2 and 3 of these notes, then you may have difficulties.)

    Course requirements: Most likely, multiple homeworks (ca. 60% total), scribing a lecture (ca. 10%), and a research project (ca. 40%); and perhaps a final exam.

    Want more information on RandNLA? DJ Rich of Mutual_Information posted a very nice summary video about RandNLA, motivated by, but much more general than, our recent RandBLAS/RandLAPACK monograph. Check it out if you want to learn more.


    Primary references:

    Much of the material has not worked its way into textbooks, and thus we will be reading reviews and primary sources. There are, however, multiple good reviews on the topic. Here are a few articles that should give you an idea of some of the topics.

    General introductory overviews:

  • "RandNLA: Randomized Numerical Linear Algebra," Drineas and Mahoney (CACM 2016) (pdf)
  • "Randomized algorithms for matrices and data," Mahoney (FnT 2011) (arxiv)
  • "Determinantal Point Processes in Randomized Numerical Linear Algebra," Derezinski and Mahoney (NoticesAMS 2020) (arxiv)

    Primary material (for the first half of the term):

  • LN-RNLA: "Lecture Notes on Randomized Linear Algebra," Mahoney (unpub 2013) (arxiv)
  • "Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning," Derezinski and Mahoney (KDD 2024) (arxiv)
  • "Lectures on Randomized Numerical Linear Algebra," Drineas and Mahoney (PCMI 2017) (arxiv)

    Theoretical computer science perspective:

  • "Sketching as a tool for numerical linear algebra," Woodruff (FnT 2014) (arxiv)
  • "Randomized algorithms in numerical linear algebra," Kannan and Vempala (Acta 2017) (pdf)

    Low-rank approximation:

  • "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions," Halko, Martinsson, and Tropp (SIREV 2011) (arxiv)

    Scientific computing perspective:

  • "Randomized methods for matrix computations," Martinsson (PCMI 2016) (arxiv)
  • "Randomized numerical linear algebra: Foundations and algorithms," Martinsson and Tropp (Acta 2020) (arxiv)
  • "Randomized matrix computations: Themes and variations," Kireeva and Tropp (unpub 2024) (arxiv)

    Software perspective:

  • "Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software," Murray, Demmel, Mahoney, Erichson, Melnichenko, Malik, Grigori, Luszczek, Derezinski, Lopes, Liang, Luo, and Dongarra (SIAM 2023) (arxiv)

    Optimization:

  • "Optimization Methods for Large-Scale Machine Learning," Bottou, Curtis, and Nocedal (SIREV 2016) (arxiv)

    Additional material:

  • TBD.

    Lectures: