Statistics 210A: Theoretical Statistics (Fall 2023)

If you are an undergraduate who wants to take the course, please fill out the permission code request form to let me know about your background.

Anyone considering taking the course is encouraged to read the frequently asked questions regarding preparation and review materials.

Course content

This is an introductory Ph.D.-level course in theoretical statistics. It is a fast-paced and demanding course intended to prepare students for research careers in statistics.

Statistics is the study of methods that use data to understand the world. Statistical methods are used throughout the natural and social sciences, in machine learning and artificial intelligence, and in engineering. Despite the ubiquitous use of statistics, its practitioners are perpetually accused of not actually understanding what they are doing. Statistical theory is, broadly speaking, the study of what we are doing when we use statistical methods. See the course introduction for a more detailed explanation, as well as comparisons to other Berkeley courses such as Stat 215A and B, Stat 210B, and CS 281A/Stat 241A (Statistical Learning Theory).

Topics include: Statistical decision theory (frequentist and Bayesian), exponential families, point estimation, hypothesis testing, resampling methods, estimating equations and maximum likelihood, empirical Bayes, large-sample theory, high-dimensional testing, multiple testing and selective inference.

Course Information

  • Prof. Will Fithian (Instructor)

  • Taejoo Ahn (GSI)

    • Office Hours: W 9-10am on Zoom, F 12-1pm in Evans 444

    • Email

  • Course schedule

    • Lectures TuTh 11-12:30, Evans 60

    • Recitation sections on alternate Fridays, 11am-12pm, in Evans 344, beginning September 1

    • Final Exam Review TBD

    • Final Exam Wed December 13, 8-11am

  • Lecture videos and homework solutions at bCourses

  • Email policy: You can email me or the GSIs about administrative questions, with “[Stat 210A]” in the subject line. No math over email, please.

  • Ed for announcements and technical discussion (no homework spoilers!)

  • Gradescope for turning in homework

Materials

Handwritten lecture notes (Fall 2023):

Typed lecture notes with additional detail (Fall 2023):

Materials from class:

Assignments:

References

All texts are available online from Springer Link.

Main text:

Supplementary texts:

Undergrad-level review texts for prerequisites:

Grading

Your final grade is based on:

  • Weekly problem sets: 50%

  • Final exam: 50%

Lateness policy: Homework must be submitted to Gradescope by midnight on Wednesday nights. Late problem sets will not be accepted, but we will drop your two lowest homework grades.

Collaboration policy: For homework, you are welcome to work with each other or to consult articles or textbooks online, with the following caveats:

  1. You must write up your solution by yourself.

  2. You may NOT consult any solutions from previous iterations of this course.

  3. If you collaborate or use any resources other than course texts, you must acknowledge your collaborators and the resources you used.

Academic integrity: You are expected to abide by the Berkeley honor code. Violating the collaboration policy, or cheating in any other way, will result in a failing grade for the semester and you will be reported to the University Office of Student Conduct.

Accommodations

Students with disabilities: Please see me as soon as possible if you need particular accommodations, and we will work out the necessary arrangements.

Scheduling conflicts: Please notify me in writing by the second week of the term about any known or potential extracurricular conflicts (such as religious observances, graduate or medical school interviews, or team activities). I will do my best to arrange accommodations, but cannot promise them in all cases. If there is no mutually workable solution, you may be dropped from the class.

Lecture schedule

Date    | Reading                                           | Topic
Aug. 26 | Chap. 1 and Sec. 3.1 of Keener                    | Probability models and risk
Aug. 31 | Chap. 2 of Keener                                 | Exponential families
Sep. 2  | Chap. 2 and Sec. 3.2 of Keener                    | Sufficient statistics
Sep. 7  | Secs. 3.4, 3.5, and 3.6 of Keener                 | Minimal sufficiency and completeness
Sep. 9  | Secs. 3.6 and 4.1 of Keener                       | Rao-Blackwell theorem
Sep. 14 | Secs. 4.1 and 4.2 of Keener                       | UMVU estimation
Sep. 16 | Secs. 4.5 and 4.6 of Keener                       | Information inequality
Sep. 21 | Secs. 7.1 and 7.2 of Keener                       | Bayesian estimation
Sep. 23 | Secs. 7.1 and 7.2 of Keener                       | Conjugate priors
Sep. 28 | Secs. 7.2 and 11.1 of Keener                      | More on Bayes
Sep. 30 | Secs. 7.2 and 11.1 of Keener                      | Hierarchical priors, empirical Bayes
Oct. 5  | Secs. 11.1, 11.2, and 9.4 of Keener               | James-Stein paradox, confidence intervals
Oct. 7  | Secs. 5.1 and 5.2 of Lehmann-Casella              | Minimaxity and admissibility
Oct. 12 | Secs. 12.1, 12.2, 12.3, and 12.4 of Keener        | Hypothesis testing, Neyman-Pearson lemma
Oct. 14 | Secs. 12.3, 12.4, 12.5, 12.6, and 12.7 of Keener  | UMP tests
Oct. 19 | Secs. 13.1, 13.2, and 13.3 of Keener              | Testing with nuisance parameters
Oct. 21 | Secs. 13.1, 13.2, and 13.3 of Keener              | UMP unbiased tests
Oct. 26 | Secs. 13.1, 13.2, and 13.3 of Keener              | UMP unbiased tests
Oct. 28 | Secs. 14.1, 14.2, 14.4, 14.5, and 14.7 of Keener  | Linear models
Nov. 2  | Secs. 8.1, 8.2, and 8.3 of Keener                 | Asymptotic concepts
Nov. 4  | Secs. 8.3 and 8.4 of Keener                       | Maximum likelihood estimation
Nov. 9  | Secs. 8.5, 9.1, and 9.2 of Keener                 | Relative efficiency
Nov. 11 | Secs. 9.1, 9.2, and 9.3 of Keener                 | Consistency of the MLE
Nov. 16 | Secs. 9.1, 9.2, and 9.3 of Keener                 | Asymptotic normality of the MLE
Nov. 18 | Secs. 9.5 and 9.7 of Keener                       | Trio of asymptotic likelihood-based tests and CIs
Nov. 23 | Secs. 19.1-19.3 of Keener                         | Bootstrap and permutation tests
Nov. 25 | (none)                                            | No class (Thanksgiving)
Nov. 30 | Secs. 15.1-15.4 of Lehmann-Romano                 | Bootstrap theory
Dec. 2  | Online notes                                      | Multiple testing
Dec. 7  | Online notes                                      | Causal inference
Dec. 9  | Online notes                                      | Causal inference