Randomized Algorithms for Matrices and Data

Instructor: Michael Mahoney
  • Email: mmahoney ATSYMBOL
  • Office hours: By appointment.
  • Office is on the third floor of Calvin Hall.

    Teaching Assistant: Yuchen Zhang
  • Email: yuczhang ATSYMBOL
  • Office hours: Fri 5:00-6:30pm, Soda alcove 411

    Class time and Location:
  • Mon-Wed 5:00-6:30pm, in 3109 Etcheverry Hall, on the UC Berkeley campus. (First meeting is Wed Sept 4, 2013.)

  • (12/31) My scribed versions of the class lectures are available below. Feedback welcome!
  • (11/15) Final project presentations will be in class on Dec 11th.
  • (11/15) There will be no class Dec 9th due to the NIPS workshops many people will be attending.
  • (11/1) Most of you have already coordinated with me regarding the final project. If you haven't done so already, then you should do so ASAP.
  • (10/16) The second homework may be found here; it is due 11/13/13.
  • (10/13) The description for the final project is posted here. I don't have regular office hours, but I do have them by appointment, so let me know if you want to meet as you formulate the project.
  • (9/23) Planning ahead, we will not be having class on the Wednesday immediately prior to Thanksgiving.
  • (9/23) We have started posting the scribed notes below. In general, we will get them up about a week after the class. They have not been thoroughly checked for completeness/correctness, so use them as a starting point to complement the readings.
  • (9/20) The list to sign up to scribe is here. For those who do not sign up by next week, we will assign the remaining students to the remaining classes randomly.
  • (9/17) The first homework may be found here; it is due 10/09/13.
  • (9/11) Please sign up here for a class to scribe, and coordinate with Yuchen as necessary. Depending on the numbers, taking the class for a grade or P/F, etc., everyone will have to scribe one or two classes.
  • (9/9) Starting on Wed 9/11, the class will meet in 3109 Etcheverry Hall.
  • (9/9) A template in TeX for scribing the lectures can be found here, and what it should look like when it is compiled can be seen here.
  • (8/30) The class is oversubscribed. If you decide to drop the class, please un-register immediately so that another student can be admitted. Some additional spaces will be made available, but if the class remains full, it may be necessary to limit enrollment.
  • (8/30) All students, including auditors, are requested to register for the class. Auditors should register S/U; an S grade will be awarded for class participation and satisfactory scribe notes.
  • (8/15) This class will be taught at UC Berkeley, not Stanford, during Fall 2013. I will be at Berkeley this fall as part of the program on "Theoretical Foundations of Big Data Analysis," to be held at the Simons Institute.

    Course description: Matrices are a popular way to model data (e.g., term-document data, people-SNP data, social network data, machine learning kernels, and so on), but the size-scale, noise properties, and diversity of modern data present serious challenges for many traditional deterministic matrix algorithms. The course will cover the theory and practice of randomized algorithms for large-scale matrix problems arising in modern massive data set analysis (i.e., Randomized Numerical Linear Algebra). Topics to be covered include: underlying theory, including the Johnson-Lindenstrauss lemma, random sampling and projection algorithms, and connections between representative problems such as matrix multiplication, least-squares regression, least-absolute deviations regression, low-rank matrix approximation, etc.; numerical and computational issues that arise in practice in implementing algorithms in different computational environments; machine learning and statistical issues, as they arise in modern large-scale data applications; and extensions/connections to related problems as well as recent work that builds on the basic methods. Appropriate for advanced graduate students in computer science, statistics, and mathematics, as well as computationally-inclined students from application domains.
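
    To give a flavor of the random projection ideas named above, here is a minimal sketch (not taken from the course materials) of a Gaussian random projection in the spirit of the Johnson-Lindenstrauss lemma. The function names, the dimensions (10 points, 500 dimensions reduced to 200), and the seed are all illustrative assumptions; it uses only the Python standard library.

```python
import math
import random


def gaussian_sketch(points, k, rng):
    """Map each d-dimensional point to k dimensions with a random Gaussian matrix."""
    d = len(points[0])
    # Entries are N(0, 1/k), so squared lengths are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    S = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(s * x for s, x in zip(row, p)) for row in S] for p in points]


def distance(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distortion_range(points, sketched):
    """Smallest and largest ratio of sketched to original pairwise distance."""
    n = len(points)
    ratios = [
        distance(sketched[i], sketched[j]) / distance(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return min(ratios), max(ratios)


rng = random.Random(0)  # fixed seed so the run is repeatable
points = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(10)]
sketched = gaussian_sketch(points, k=200, rng=rng)
lo, hi = distortion_range(points, sketched)
print(f"pairwise distance ratios lie in [{lo:.3f}, {hi:.3f}]")
```

    With these settings the ratios should cluster around 1, which is the qualitative behavior the lemma describes: pairwise distances are approximately preserved even though the projection ignores the data.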

    Prerequisites: General mathematical sophistication; and a solid understanding of Algorithms, Linear Algebra, and Probability Theory, at the advanced undergraduate or beginning graduate level, or equivalent.

    Course requirements: Most likely, three homeworks (ca. 15-20% each), scribe a lecture (ca. 10%), and a major project (ca. 40%).

    Primary references: Much of the material has not worked its way into textbooks, and thus we will be reading reviews and primary sources. Here are a few articles that should give you an idea of some of the topics. Additional articles for particular topics and particular classes are listed below.


    (My scribed versions of the lectures are included below.)