Statistics and Data Science 242/542: Theory of Statistics

Zhou Fan, Yale University, Spring 2025


Description

Principles of statistical inference. Topics include hypothesis testing, parameter estimation, uncertainty quantification, prediction, Bayesian analysis, and simulation-based inference.

Prerequisites: Probability theory (S&DS 241/541) and multivariable calculus (Math 120). Some knowledge of computer programming, or willingness to learn!

Lectures

Mondays and Wednesdays
2:30 - 3:45PM
YSB Marsh Auditorium

Teaching staff and office hours

Instructor:
Zhou Fan, zhou.fan@yale.edu

Course manager:
Bella Bao, bella.bao@yale.edu

Teaching assistants:
Johanna Dammann, johanna.dammann@yale.edu
Xinyang Hu, xinyang.hu@yale.edu
Langchen Liu, langchen.liu@yale.edu
Linghai Liu, linghai.liu@yale.edu
Neil Mathew, neil.mathew@yale.edu
Selma Mazioud, selma.mazioud@yale.edu
Max Lovig, max.lovig@yale.edu
Matthew Ross, matthew.ross@yale.edu
Ivan Sinyavin, ivan.sinyavin@yale.edu
Arjun Verma, arjun.verma@yale.edu
Brian Xiang, brian.xiang@yale.edu
Grant Zhang, grant.zhang@yale.edu
Bronson Zhou, bronson.zhou@yale.edu

Office hours:
Sunday 2-3PM, WLH006 (Johanna)
Monday 12-1PM, KT1101 (Xinyang)
Monday 1-2PM, Zoom (Matthew)
Monday 5-6PM, KT1208 (Zhou)
Monday 7:30-8:30PM, HQ133 (Neil)
Tuesday 10-11AM, KT1101 (Brian)
Tuesday 12-1PM, KT1208 (Max)
Tuesday 3-4PM, KT1105B (Linghai)
Tuesday 4-5PM, KT1105B (Ivan)
Tuesday 7-8PM, HQ133 (Arjun)
Thursday 4-5PM, KT1105B (Langchen)
Friday 11AM-12PM, KT209 (Grant)
Friday 12-1PM, Zoom (Bronson)
Saturday 12-1PM, Zoom (Selma)

Requirements and policies

Homework

Assigned approximately weekly, due Wednesdays at 1PM on Gradescope. You may use a total of 8 late days over the semester without penalty, with at most 4 late days on any single assignment. Beyond these, a late assignment will incur a 20% penalty for each day it is late. Assignments more than 4 days late will not be accepted. Please indicate at the top of your assignment the number of late days used.
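As a rough illustration of the late-day arithmetic, here is a minimal Python sketch. It is not an official calculator, and it rests on assumptions not stated in the policy: penalty-free late days are applied before any penalized days, and the 20% penalty is taken as a fraction of the assignment's score, added up per penalized day. The function name late_penalty and the constants are hypothetical.

```python
# Illustrative sketch only; the policy text above is authoritative.
MAX_DAYS_LATE = 4        # assignments more than 4 days late are not accepted
PENALTY_PER_DAY = 0.20   # assumed: fraction of the assignment score lost per penalized day


def late_penalty(days_late, free_days_left):
    """Return (penalty_fraction, free_days_left_after), or None if not accepted.

    Assumes penalty-free late days are used up before any penalty applies.
    """
    if days_late > MAX_DAYS_LATE:
        return None
    free_used = min(days_late, free_days_left)
    penalized_days = days_late - free_used
    return PENALTY_PER_DAY * penalized_days, free_days_left - free_used
```

Under these assumptions, an assignment submitted 2 days late with all 8 late days still available incurs no penalty and leaves 6 late days, while an assignment submitted 3 days late with only 1 late day remaining incurs a penalty for 2 days.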

Homework assignments will include computing exercises asking you to perform small simulations, create histograms and plots, and analyze data. Guidance will be provided in the programming language R, although you may choose to use any other language (e.g. Python, Julia, Matlab). You will be graded on your results, not on the quality of your code.

Collaboration and AI use

You are encouraged to discuss homework problems with your classmates, but you must submit your own individual homework write-up, using your own code for the programming exercises. Please indicate at the top of your submission the names of your collaborators.

Use of generative AI tools (e.g. ChatGPT, Claude, Gemini, Llama) is not permitted, unless otherwise noted on the homework assignment.

Exams

Midterm: Monday Feb 24, 7PM
Final: Tuesday May 6, 9AM

Grading

Your final grade will be the maximum of the following two weightings:
30% x (average homework) + 35% x (midterm exam) + 35% x (final exam)
30% x (average homework) + 20% x (midterm exam) + 50% x (final exam)
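
To make the rule concrete, the following short Python sketch computes both weightings and takes the larger one. The function name final_grade and the assumption that all three scores are on the same 0-100 scale are illustrative, not part of the course policy.

```python
def final_grade(homework, midterm, final):
    """Return the larger of the two weighted averages described above.

    All inputs are assumed to be percentages on a common 0-100 scale.
    """
    weighting_a = 0.30 * homework + 0.35 * midterm + 0.35 * final
    weighting_b = 0.30 * homework + 0.20 * midterm + 0.50 * final
    return max(weighting_a, weighting_b)


# Hypothetical scores: homework 90, midterm 70, final 85.
print(final_grade(90, 70, 85))
```

For these hypothetical scores the first weighting gives 81.25 and the second gives 83.5, so the second is used. In general, the second weighting is the larger one exactly when the final exam score exceeds the midterm score.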

Textbooks

John A. Rice, Mathematical Statistics and Data Analysis, 3rd edition.

Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning (with Applications in R).