STAT154/254: Introduction to Statistical Learning

Song Mei, University of California, Berkeley, Spring 2024

Description

Instructor: Song Mei (songmei [at] berkeley.edu)
Lectures: Tuesday/Thursday 11:00-12:30. Social Sciences Building 20.
Office Hours: TBA. Evans 387.
GSI: Adam Quinn Jaffe (aqjaffe [at] berkeley.edu)
Lab sessions: Monday 12 pm - 2 pm and 3 pm - 5 pm, Evans 334.
Office Hours: TBA.

This course focuses on statistical and machine learning methods, together with data analysis and programming skills. Upon completing this course, students are expected to be able to 1) build baseline models for real-world data analysis problems; 2) implement models in a programming language; 3) draw insights and conclusions from models.

Announcements

  • First lecture starts on Jan 16, 2024 (Tuesday).

  • For students who filled in the enrollment appeal forms and concurrent enrollment students: I will process all the petitions on Jan 19. Please come to the first few lectures and decide whether you will take this course.

  • We will use Ed for discussions and questions.

  • Please find homework and lecture notes on bCourse under “Files”.

  • HW policy: You have three late days in total that you can use without penalty over the semester. After those are used, each additional late day incurs a 10% deduction from the grade of that HW. The lowest HW grade can be dropped when computing the total grade.

  • The lectures will be recorded through Course Capture. The recordings can be found on bCourse under “Media Gallery”.

Prerequisite

  • MATH 53 and 54 or equivalents; MATH 110 is highly recommended. STAT 135 or equivalent (DATA/STAT C100 together with either STAT 134 or DATA/STAT C140 will be accepted). STAT 133 is preferred and STAT 151A is recommended. Experience with a scripting language is required; Python experience is recommended.

  • Review of Matrix Algebra and Calculus, Basic Probability (adapted from CS229 at Stanford).

Grading

  • Class attendance is required.

  • Homework is assigned every two weeks. There will be 6-7 HWs.

  • In-class midterm. Date: Mar 12.

  • Final exam date: May 9, 8 am - 11 am.

  • The final grade is Homework × 40% + midterm × 25% + final × 35%.

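As an illustration only, the grading arithmetic can be sketched in Python. This is not an official grade calculator. The function names are hypothetical, and several details are assumptions the syllabus does not specify: free late days are consumed in submission order, each penalized day removes 10% of that homework's own score, the lowest score is dropped after late deductions, and homework is averaged with equal weights on a 0-100 scale.

```python
from typing import List, Sequence


def apply_late_penalty(
    scores: Sequence[float],
    late_days: Sequence[int],
    free_days: int = 3,
    rate: float = 0.10,
) -> List[float]:
    """Return homework scores after late deductions.

    Assumptions (not stated in the syllabus): free late days are used up in
    submission order, and each penalized day removes `rate` of that
    homework's score, never going below zero.
    """
    remaining = free_days
    adjusted = []
    for score, days in zip(scores, late_days):
        used = min(days, remaining)      # free late days spent on this HW
        remaining -= used
        penalized = days - used          # late days that incur a deduction
        factor = max(0.0, 1.0 - rate * penalized)
        adjusted.append(score * factor)
    return adjusted


def final_grade(
    hw_scores: Sequence[float],
    late_days: Sequence[int],
    midterm: float,
    final: float,
) -> float:
    """Combine components as Homework x 40% + midterm x 25% + final x 35%."""
    adjusted = apply_late_penalty(hw_scores, late_days)
    # Assumption: the lowest adjusted homework score is dropped.
    kept = sorted(adjusted)[1:] if len(adjusted) > 1 else adjusted
    hw_average = sum(kept) / len(kept)
    return 0.40 * hw_average + 0.25 * midterm + 0.35 * final


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(final_grade([100, 100, 50], [0, 0, 0], midterm=80, final=90))
```

Under a different reading of the policy (for example, deductions in percentage points rather than as a fraction of the score), only `apply_late_penalty` would need to change.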

Topics

Basic topics:

  • Tasks: Regression. Classification. Dimension reduction. Clustering.

  • Algorithms: Solving linear systems. Gradient descent. Newton’s method. Power iteration for eigenvalue problems. EM algorithms.

  • Others: Kernel methods. Regularization. Sample splitting. Resampling methods. Cross validation.

Advanced topics:

  • Statistical learning theory and optimization theory.

  • Bagging and Boosting. Tree based models. Neural networks. Bayesian models.

  • Online learning. Bandit problems.

Assignments

  • Homework is posted on bCourse under “Files”. Please submit homework on Gradescope by the end of the due date (Pacific Time); you can find the course link through bCourse. You should be able to solve more than 60% of HW1 to satisfy the prerequisites.

  • The homework and exam questions differ between STAT 154 and STAT 254.

Homework 1 Due on -
Homework 2 Due on -
Homework 3 Due on -
Homework 4 Due on -
Homework 5 Due on -
Homework 6 Due on -
Homework 7 Due on -

Schedule

Jan 16 (T) First lecture
Jan 18 (Th)
Jan 23 (T)
Jan 25 (Th)
Jan 30 (T)
Feb 1 (Th)
Feb 6 (T)
Feb 8 (Th)
Feb 13 (T)
Feb 15 (Th)
Feb 20 (T)
Feb 22 (Th)
Feb 27 (T)
Feb 29 (Th)
Mar 5 (T)
Mar 7 (Th)
Mar 12 (T) In class midterm
Mar 14 (Th)
Mar 19 (T)
Mar 21 (Th)
Mar 26 (T) Spring Recess
Mar 28 (Th) Spring Recess
Apr 2 (T) Travel
Apr 4 (Th)
Apr 9 (T)
Apr 11 (Th)
Apr 16 (T)
Apr 18 (Th)
Apr 23 (T)
Apr 25 (Th) Last lecture
Apr 30 (T) Review
May 2 (Th) Review
May 7 (T) -
May 9 (Th) Final exam