STAT154: Introduction to Statistical Learning

Song Mei, University of California, Berkeley, Fall 2021

Description

Instructor: Song Mei (songmei [at] berkeley.edu)
Lectures: Tuesday/Thursday 17:00-18:30, Moffitt Library 101.
Office Hours: Tuesday 13:30-15:00, Evans 387 (and on Zoom).
GSI: Ryan Theisen (theisen [at] berkeley.edu)
Lab sessions: Monday 10:00-12:00, Moffitt Library 103; Monday 14:00-16:00, Wheeler 220.
Office Hours: Friday 10:00-11:30 (on Zoom).

This course focuses on statistical and machine learning methods, together with data analysis and programming skills. After completing the course, students are expected to be able to 1) build baseline models for real-world data analysis problems; 2) implement models in a programming language; 3) draw insights and conclusions from models.

Announcements

  • HW policy: You have three late days in total to use without penalty over the semester. Once those are used, each additional late day deducts 10% from the grade of that homework. Your lowest homework grade is dropped when computing the total grade.

  • All course-related information is posted on this website (not on bCourses). If you have questions, please use Piazza.

  • Lectures and lab sessions are recorded with computer cameras, so quality is not guaranteed and some recordings may be missing for technical reasons. The recordings are in the Recordings folder; you need a Berkeley account to access it.

Prerequisites

  • MATH 53 and 54 or equivalents; MATH 110 is highly recommended. STAT 135 or equivalent (DATA/STAT C100 together with either STAT 134 or DATA/STAT C140 will be accepted). STAT 133 is preferred and STAT 151A is recommended. Experience with a scripting language is required; R experience is recommended.

  • Review of Matrix Algebra and Calculus, Basic Probability (adapted from CS229 at Stanford).

Grading

  • Class attendance is required.

  • Homework is assigned every two weeks. There will be 6-7 HWs.

  • In-class midterm. Date TBA.

  • Final exam date: Dec 16, 11:30 am - 2:30 pm.

  • Final grade = Homework × 50% + midterm × 20% + final × 30%.

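
The grading rules on this page (three free late days, a 10% deduction per further late day, the lowest homework dropped, and the 50/20/30 weighting) can be sketched as a short calculation. This is only an illustration, not the official grading script. The function names are hypothetical, and several details are assumptions the page does not spell out: scores are fractions between 0 and 1, free late days are consumed in submission order, and the 10% deduction is taken as a fraction of that homework's own score.

```python
def homework_average(scores, late_days, free_late_days=3, penalty_per_day=0.10):
    """Average homework score after late penalties, dropping the lowest one.

    scores    : homework scores as fractions in [0, 1], in submission order
    late_days : number of days each homework was late (same order)
    Assumption: free late days are used up in submission order.
    """
    remaining = free_late_days
    adjusted = []
    for score, days in zip(scores, late_days):
        covered = min(days, remaining)      # late days absorbed by the free budget
        remaining -= covered
        penalized_days = days - covered     # late days that incur the deduction
        # Assumption: each penalized day removes 10% of this homework's score.
        factor = max(0.0, 1.0 - penalty_per_day * penalized_days)
        adjusted.append(score * factor)
    if len(adjusted) > 1:
        adjusted.remove(min(adjusted))      # drop the lowest homework grade
    return sum(adjusted) / len(adjusted)


def final_grade(hw_avg, midterm, final):
    """Weighted total: homework 50%, midterm 20%, final exam 30%."""
    return 0.5 * hw_avg + 0.2 * midterm + 0.3 * final


if __name__ == "__main__":
    hw = homework_average([1.0, 0.8, 0.9], [0, 0, 0])
    print(final_grade(hw, midterm=0.8, final=0.9))
```

In the usage example, the homework scored 0.8 is the lowest and is dropped before averaging, so only the other two contribute to the 50% homework component.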

Topics

Basic topics:

  • Tasks: Regression. Classification. Dimension reduction. Clustering.

  • Algorithms: Solving linear systems. Gradient descent. Newton’s method. Power iteration for eigenvalue problems. EM algorithms.

  • Others: Kernel methods. Regularization. Sample splitting. Resampling methods. Cross-validation.

Advanced topics:

  • Statistical learning theory and optimization theory.

  • Bagging and boosting. Tree-based models. Neural networks. Bayesian models.

  • Online learning. Bandit problems.