Department of Statistics and Data Science
Course List for Fall 2023/Spring 2024 -- Subject to Change!

Revised: 14 August 2023
Courses whose numbers end with "a" are offered in the FALL. Courses whose numbers end with "b" are offered in the SPRING.
Courses whose numbers end with "ab" are offered both semesters. Courses with a gray background are not taught this year.

Course | Number | Instructor | Time | Room
Introduction to Statistics | S&DS 101-109/501-509 | Jonathan Reuning-Scherer and Staff | Tues, Thurs 1:00-2:15 |
YData: Data Science for Political Campaigns | S&DS 172b/572b PLSC347b/524b | Joshua Kalla | Wed 1:30-3:20 |
YData: Foundations of Sociogenomics | S&DS 178/578 SOCY 362 | Ramina Sotoudeh | Wed 1:00-3:20 |
Data Exploration and Analysis | S&DS 230/530 PLSC 530 | Ethan Meyers | Tues, Thurs 9:00-10:15 |
Probability and Bayesian Statistics | S&DS 238/538 | Joe Chang | Tues, Thurs 1:00-2:15 |
Probability for Data Science | S&DS 240/540 | Bob Wooster | Mon, Wed 2:30-3:45 |
Probability Theory with Applications | S&DS 241/541 MATH 241 | Yihong Wu | Mon, Wed 9:00-10:15 |
Introductory Machine Learning | S&DS 265/565 | John Lafferty | Tues, Thurs 11:30-12:50 |
Neural Data Analysis | S&DS 280/580 | Ethan Meyers | Tues, Thurs 2:30-3:45 |
Linear Models | S&DS 312/612 | Zongming Ma | Mon, Wed 11:35-12:50 |
Intermediate Machine Learning | S&DS 365/665 | John Lafferty | Mon, Wed 1:00-2:15 |
Advanced Probability | S&DS 400/600 MATH 330 | Sekhar Tatikonda | Tues, Thurs 2:30-3:45 |
Statistical Inference | S&DS 410/610 | Harrison Zhou | Tues, Thurs 11:35-12:50 |
Statistical Case Studies | S&DS 425 | Brian MacDonald | Mon, Wed 1:00-2:15 |
Senior Project | S&DS 491 | Brian MacDonald | - |
Theory of Statistics | S&DS 542 | Andrew Barron | Tues, Thurs 9:00-10:15? |
Causal Inference and Research Design | S&DS 616 PLSC 508 | P Aronow | Thurs 4:00-5:50 |
Statistical Case Studies | S&DS 625 | Brian MacDonald | Mon, Wed 2:30-3:45 |
Computation and Optimization | S&DS 431/631 | Zhuoran Yang | Tues, Thurs 1:00-2:15 |
Computational and Statistical Trade-offs in High Dimensional Statistics | S&DS 688 | Ilias Zadik | Tues 4:00-5:50 |
Scientific Machine Learning | S&DS 689 | Lu Lu | Tues, Thurs 4:00-5:15 |
Indep Study | S&DS 480ab | Staff | - |
Practical Work | S&DS 626ab | DGS | - |
Statistical Consulting | S&DS 627a/628b | Jay Emerson | Fri 2:30-4:30 |
Independent Study or Topics Course | S&DS 690ab | DGS | - |
Departmental Seminar | S&DS 700ab | - | Mon 4:00-5:30 |
Introductory Statistics | S&DS 100b/500b | Ethan Meyers | Tues, Thurs 9:00-10:15 |
YData | S&DS 123b | Ethan Meyers | Tues, Thurs 2:30-3:45 |
YData: Data Science Applications in Insurance | S&DS 179 | Perry Beaumont | Tues, Thurs 9:00-10:15 |
Intensive Introductory Statistics and Data Science | S&DS 220b/520b | Bob Wooster | Tues, Thurs 9:00-10:15 |
Data Exploration and Analysis | S&DS 230b/530b PLSC 530b | Jonathan Reuning-Scherer | Tues, Thurs 9:00-10:15 |
Theory of Statistics | S&DS 242b/542b | Bob Wooster | Mon, Wed 9:00-10:15 |
Computational Tools for Data Science | S&DS 262/562 | Roy Lederman | Mon, Wed 1:00-2:15 |
Deep Learning, 265-565-level? | S&DS 266/566 | Lu Lu | Tues, Thurs 4:00-5:15 |
Stochastic Processes | S&DS 351b/551b | Ilias Zadik | Mon, Wed 1:00-2:15 |
Biomedical Data Science, Mining and Modeling | S&DS 352/MCDB 452 | Mark Gerstein and Matthew Simon | Mon, Wed 1:00-2:15 |
Data Analysis | S&DS 361b/661b | Brian MacDonald | Tues, Thurs 9:00-10:15 |
Multivariate Statistics for Social Sciences | S&DS 363b/563b | Jonathan Reuning-Scherer | Tues, Thurs 1:00-2:15 |
Information Theory | S&DS 364b/664b | Andrew Barron | Tues, Thurs 11:35-12:50 |
Statistical Case Studies | S&DS 425/625 | Jay Emerson | Fri 9:25-11:15 |
Senior Project | S&DS 492b | Brian MacDonald | - |
Selected Topics in Statistical Decision Theory | S&DS 411a/611b | Harrison Zhou | Mon 10:30-12:20 |
Applied Machine Learning and Causal Inference Research Seminar | S&DS 617 | Jas Sekhon | Mon 1:30-3:20 |
Asymptotic Statistics | S&DS 618 | Zongming Ma | Tues 7:00-8:50 |
Advanced Optimization Techniques | S&DS 432b/632b | Zhuoran Yang | Tues, Thurs 1:00-2:15 |
Statistical Methods in Computational Biology | S&DS 645b/BIS 692b/CB&B 692 | Hongyu Zhao | Thurs 10:00-11:50 |
Applied Spatial Statistics | S&DS 674b/F&ES 781b | Tim Gregoire | Tues, Thurs 10:30-11:50 |
Information-theoretic methods in high-dimensional statistics | S&DS 677b | Yihong Wu | Tues 4:00-5:50 |
Statistics and Data Science Computing Laboratory (1/2 credit) S&DS 110b/510b
not taught this year
YData: Text Data Science: An Introduction S&DS 171b/571b
not taught this year
YData: Analysis of Baseball Data S&DS 173b/573b
not taught this year
YData: Statistics in the Media S&DS 174b/574b
not taught this year
YData: COVID-19 Behavior S&DS 177b/577b
not taught this year
Theory of Probability and Statistics S&DS 239a/539a
not taught this year
Applied Machine Learning and Causal Inference S&DS 317b/517b
not taught this year
Design and Analysis of Algorithms CPSC 365b
not taught this year
Optimization Techniques S&DS 430a/630a ENAS 530a EENG 437a ECON 413a
not taught this year
Senior Seminar and Project S&DS 490a
not taught this year
Research Design and Causal Inference PLSC 508a
not taught this year
Applied Linear Models S&DS 531a
not taught this year
Intensive Algorithms S&DS 566
not taught this year
Introduction to Random Matrix Theory and Applications S&DS 615b
not taught this year
Spectral Graph Theory CPSC 662a
not taught this year
Probabilistic Networks, Algorithms, and Applications S&DS 667a
not taught this year
Nonparametric Estimation and Machine Learning S&DS 468b
not taught this year
Topics on Random Graphs MATH 670
not taught this year
Information Theory Tools in Probability and Statistics S&DS 672a
not taught this year
Topological Data Analysis S&DS 675a
not taught this year
Signal Processing for Data Science S&DS 676b
not taught this year
Function Estimation S&DS 679
not taught this year
High-Dimensional Function Estimation (prev title) S&DS 682a
not taught this year
Statistical Methods in Neuroimaging S&DS 683a
not taught this year
Research Seminar in Probability S&DS 699ab
not taught this year
Placeholder -- Monograph 706
not taught this year

Introductory Statistics (S&DS 100b/500b)
Instructor: Ethan Meyers
Time: Tues, Thurs 9:00-10:15
Place: TBD
An introduction to statistical reasoning. Topics include numerical and graphical summaries of data, data acquisition and experimental design, probability, hypothesis testing, confidence intervals, correlation and regression. Application of statistical concepts to data; analysis of real-world problems. A faster-paced version of this course with a higher level of computing is being created: See STAT 220a.

Introduction to Statistics (S&DS 101-109/501-509)
Instructor: Jonathan Reuning-Scherer and Staff
Time: Tues, Thurs 1:00-2:15
Place: YSB MARSH
Webpage:  http://www.stat.yale.edu/Courses/QR/stat101106.html
A basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks of classes are attended by all students in STAT 101-106 together, as general concepts and methods of statistics are developed. The remaining weeks are divided into field-specific sections that develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence and only one may be taken for credit. No prerequisites beyond high school algebra. May not be taken after STAT 100 or 109.

Students enrolled in STAT 101-106 who wish to change to STAT 109, or those enrolled in STAT 109 who wish to change to STAT 101-106, must submit a course change notice, signed by the instructor, to their residential college dean by Friday, September 28. The approval of the Committee on Honors and Academic Standing is not required.

NEW for Fall 2019: S&DS 108: Introduction to Statistics: Advanced Fundamentals is available and allows students to earn 1/2 credit for completing one of the field-specific courses during the second half of the semester.

Introduction to Statistics: Life Sciences (S&DS 101a/501a E&EB 210aG/MCDB 215a)
Instructor: Jonathan Reuning-Scherer and Walter Jetz
Time: Tues, Thurs 1:00-2:15
Place: OML 202
Statistical and probabilistic analysis of biological problems presented with a unified foundation in basic statistical theory. Problems are drawn from genetics, ecology, epidemiology, and bioinformatics.

Introduction to Statistics: Political Science (S&DS 102a/502a EP&E 203a/PLSC 425a)
Instructor: Jonathan Reuning-Scherer
Time: Tues, Thurs 1:00-2:15
Place: OML 202
Statistical analysis of politics and quantitative assessments of public policies. Problems presented with reference to a wide array of examples: public opinion, campaign finance, racially motivated crime, and health policy.

Introduction to Statistics: Social Sciences (S&DS 103a/503a SOCY 119a)
Instructor: Jonathan Reuning-Scherer
Time: Tues, Thurs 1:00-2:15
Place: OML 202
Descriptive and inferential statistics applied to analysis of data from the social sciences. Introduction of concepts and skills for understanding and conducting quantitative research.

Introduction to Statistics: Medicine (S&DS 105a/505a)
Instructor: Jonathan Reuning-Scherer and Ethan Meyers
Time: Tues, Thurs 1:00-2:15
Place: OML 202
Statistical methods used in medicine and medical research. Practice in reading medical literature competently and critically, as well as practical experience performing statistical analysis of medical data.

Introduction to Statistics: Data Analysis (S&DS 106a/506a)
Instructor: Jonathan Reuning-Scherer and Bob Wooster
Time: Tues, Thurs 1:00-2:15
Place: 
An introduction to probability and statistics with emphasis on data analysis.

Introduction to Statistics: Advanced Fundamentals (S&DS 108a)
Instructor: Jonathan Reuning-Scherer
Time: Tues, Thurs 1:00-2:15
Place: TBD
More advanced concepts and methods in statistics. Meets for the second half of the term only. May not be taken after STAT 100 or after completing 101-106 or after more advanced coursework.

Introduction to Statistics: Fundamentals (S&DS 109a)
Instructor: Jonathan Reuning-Scherer
Time: Tues, Thurs 1:00-2:15
Place: OML 202
General concepts and methods in statistics. Meets for the first half of the term only. May not be taken after STAT 100 or 101-106.

R for Statistical Computing and Data Science (S&DS 110/510)
Instructor: Jay Emerson
Time: Mon, Wed 2:30-3:45
Place: TEAL?
This will now be revived as a full-credit course. It will have a limited number of seats for Yale College students, and will include graduate students from S&DS. The class provides an intensive introduction to programming with the R statistical language, based on the S language developed at Bell Labs by John Chambers and Richard Becker. It has become the accepted language for advanced statistical computing and data science in both industry and a wide range of academic disciplines.

[ Statistics and Data Science Computing Laboratory (1/2 credit) (S&DS 110b/510b) ]

An Introduction to R? for Statistical Computing and Data Science (1 credit?) (S&DS 110/510a)
Instructor: Elena Khusainova
Time: TBD
Place: TBD
This is a 1/2 credit course that meets for the first 7 weeks of the semester. The class provides an introduction to the R statistical language, based on the S language developed at Bell Labs by John Chambers and Richard Becker. It has become the accepted language for advanced statistical computing and data science in both industry and a wide range of academic disciplines.

YData (S&DS 123b)
Instructor: Ethan Meyers
Time: Tues, Thurs 2:30-3:45
Place: TBD
Computational, programming, and statistical skills are no longer optional in our increasingly data-driven world; these skills are essential for opening doors to manifold research and career opportunities. This course aims to dramatically enhance knowledge and capabilities in fundamental ideas and skills in data science, especially computational and programming skills along with inferential thinking. YData is an introduction to Data Science that emphasizes the development of these skills while providing opportunities for hands-on experience and practice. YData is accessible to students with little or no background in computing, programming, or statistics, but is also engaging for more technically oriented students through extensive use of examples and hands-on data analysis. Python 3, a popular and widely used computing language, is the language used in this course. The computing materials will be hosted on a special-purpose web server.

Foreign Assistance to Sub-Saharan Africa: Archival Data Analysis (S&DS 138b/AFST 378/EVST 378/AFST 570)
Instructor: Russell Barbour
Time: TBD
Place: TBD

Data Science Ethics (S&DS 150)
Instructor: Elisa Celis
Time: Mon 9:25-11:15
Place: TBD
Description needed.

[ YData: Text Data Science: An Introduction (S&DS 171b/571b) ]

YData: Data Science for Political Campaigns (S&DS 172b/572b PLSC347b/524b)
Instructor: Joshua Kalla
Time: Wed 1:30-3:20
Place: TBD
Political campaigns have become increasingly data driven. Data science is used to inform where campaigns compete, which messages they use, how they deliver them, and among which voters. In this course, we explore how data science is being used to design winning campaigns. Students gain an understanding of what data is available to campaigns, how campaigns use this data to identify supporters, and the use of experiments in campaigns. This course provides students with an introduction to political campaigns, an introduction to data science tools necessary for studying politics, and opportunities to practice the data science skills presented in S&DS 123, YData. Prerequisite: S&DS 123, which may be taken concurrently. 0.5 Yale College course credit(s)

[ YData: Analysis of Baseball Data (S&DS 173b/573b) ]

[ YData: Statistics in the Media (S&DS 174b/574b) ]

YData: Measuring Culture (S&DS 175b/575b)
Instructor: Daniel Karell
Time: Thurs 3:30-5:20
Place: TBD
Prerequisite: S&DS 123, which may be taken concurrently.

YData: Humanities Data Mining (S&DS 176b/576b)
Instructor: Peter Leonard
Time: Tues, Thurs 1:00-2:15
Place: TBD
Prerequisite: S&DS 123, which may be taken concurrently.

[ YData: COVID-19 Behavior (S&DS 177b/577b) ]

YData: Foundations of Sociogenomics (S&DS 178/578 SOCY 362)
Instructor: Ramina Sotoudeh
Time: Wed 1:00-3:20
Place: TBD

YData: Data Science Applications in Insurance (S&DS 179)
Instructor: Perry Beaumont
Time: Tues, Thurs 9:00-10:15
Place: TBD

Intensive Introductory Statistics and Data Science (S&DS 220b/520b)
Instructor: Bob Wooster
Time: Tues, Thurs 9:00-10:15
Place: TBD
Introduction to statistical reasoning for students with particular interest in data science and computing. Topics, taught using the R language, include exploratory data analysis, probability, hypothesis testing, confidence intervals, regression, statistical modeling, and simulation. Computing is taught and used extensively, along with application of statistical concepts to the analysis of real-world data science problems. MATH 115 is helpful but not required.

Data Exploration and Analysis (S&DS 230/530 PLSC 530)
Instructor: Ethan Meyers
Time: Tues, Thurs 9:00-10:15
Place: ML 211
Survey of statistical methods: plots, transformations, regression, analysis of variance, clustering, principal components, contingency tables, and time series analysis. The R computing language and Web data sources are used. After STAT 100 or the equivalent or with permission from the instructor; students without prior coursework in statistics should take STAT 100, 10X, or 200.

Data Exploration and Analysis (S&DS 230b/530b PLSC 530b)
Instructor: Jonathan Reuning-Scherer
Time: Tues, Thurs 9:00-10:15
Place: TBD
Survey of statistical methods: plots, transformations, regression, analysis of variance, clustering, principal components, contingency tables, and time series analysis. The R computing language and Web data sources are used. After STAT 100 or the equivalent or with permission from the instructor; students from STAT 200 may be permitted in 230 but are encouraged to take 361 and/or 325.

Probability and Bayesian Statistics (S&DS 238/538)
Instructor: Joe Chang
Time: Tues, Thurs 1:00-2:15
Place: 17HLH 101 - TEAL
Fundamental principles and techniques of probabilistic thinking, statistical modeling, and data analysis. Essentials of probability, including conditional probability, random variables, distributions, law of large numbers, central limit theorem, and Markov chains. Statistical inference with emphasis on the Bayesian approach: parameter estimation, likelihood, prior and posterior distributions, Bayesian inference using Markov chain Monte Carlo. Introduction to regression and linear models. Computers are used for calculations, simulations, and analysis of data.

Prerequisite: knowledge of single-variable calculus is assumed. Some brief acquaintance with multivariable calculus (e.g. double integrals) and matrices would also be helpful but is not required.
Extra: STAT 238 Extra Session,  Tues 6:30-8:00,  24 Hillhouse

[ Theory of Probability and Statistics (S&DS 239a/539a) ]

Probability for Data Science (S&DS 240/540)
Instructor: Bob Wooster
Time: Mon, Wed 2:30-3:45
Place: ML 211
Introduction to probability theory, not for the major.

Probability Theory with Applications (S&DS 241/541 MATH 241)
Instructor: Yihong Wu
Time: Mon, Wed 9:00-10:15
Place: DAVIES AUD
Introduction to probability theory. Topics include probability spaces, random variables, expectations and probabilities, conditional probability, independence, discrete and continuous distributions, central limit theorem, Markov chains, and probabilistic modeling.
Extra: STAT 241 TA Session,  Thurs 6:30-7:30,  24 Hillhouse

Theory of Statistics (S&DS 242b/542b)
Instructor: Bob Wooster
Time: Mon, Wed 9:00-10:15
Place: TBD
Study of the principles of statistical analysis. Topics include maximum likelihood, sampling distributions, estimation, confidence intervals, tests of significance, regression, analysis of variance, and the method of least squares. Some statistical computing.

Computational Tools for Data Science (S&DS 262/562)
Instructor: Roy Lederman
Time: Mon, Wed 1:00-2:15
Place: TBD
Assumes mathematical maturity and some programming experience.

Introductory Machine Learning (S&DS 265/565)
Instructor: John Lafferty
Time: Tues, Thurs 11:30-12:50
Place: WLH 201
BRAINSTORMING: Fewer mathematical prerequisites, for the certificate and not for the major?

Deep Learning, 265-565-level? (S&DS 266/566)
Instructor: Lu Lu
Time: Tues, Thurs 4:00-5:15
Place: TBD

Neural Data Analysis (S&DS 280/580)
Instructor: Ethan Meyers
Time: Tues, Thurs 2:30-3:45
Place: TBD
In this class we will discuss data analysis methods that are used in the neuroscience community. Methods that we will discuss include classical descriptive and inferential statistics, point process models, mutual information measures, machine learning (neural decoding) analyses, dimensionality reduction methods, and representational similarity analyses. Each week we will read a research paper that uses one of these methods, and we will replicate these analyses using the R programming language. The emphasis of the course will be on analyzing neural spiking data, although we will also discuss other imaging modalities such as electro/magneto-encephalography (EEG/MEG), two-photon imaging, and possibly functional magnetic resonance imaging (fMRI) data. Data we will analyze include smaller datasets, such as single neuron recordings from the songbird vocal motor system, as well as larger datasets, such as the Allen Brain Observatory’s simultaneous recordings from the mouse visual system. Experience using programming to analyze data is required, and background in basic neuroscience is recommended.

Linear Models (S&DS 312/612)
Instructor: Zongming Ma
Time: Mon, Wed 11:35-12:50
Place: DL 220
The geometry of least squares; distribution theory for normal errors; regression, analysis of variance, and designed experiments; numerical algorithms, with particular reference to the R statistical language.

After STAT 242 and MATH 222 or 225.

No final exam.

Introduction to Causal Inference (S&DS 314b)
Instructor: Winston Lin
Time: Tues, Thurs 4:00-5:15
Place: TBD
Introduction to causal inference with applications to the social and health sciences. Topics include randomized experiments, matching and propensity score methods, sensitivity analysis, instrumental variables, and regression discontinuity designs. Mathematical problems, data analysis in R, and critical discussions of published applied research.

Prerequisite: S&DS 242 and some programming experience in R.

Measuring Impact and Opinion Change (S&DS 315a/PLSC 340a)
Instructor: Josh Kalla?
Time: TBD
Place: TBD

Topics in the Design and Analysis of Experiments (S&DS 316a/516a)
Instructor: Winston Lin
Time: 
Place: 

[ Applied Machine Learning and Causal Inference (S&DS 317b/517b) ]

Stochastic Processes (S&DS 351b/551b)
Instructor: Ilias Zadik
Time: Mon, Wed 1:00-2:15
Place: TBD
Introduction to the study of random processes, including Markov chains, Markov random fields, martingales, random walks, Brownian motion, and diffusions. Techniques in probability, such as coupling and large deviations. Applications chosen from image reconstruction, Bayesian statistics, finance, probabilistic analysis of algorithms, and genetics and evolution.

Biomedical Data Science, Mining and Modeling (S&DS 352/MCDB 452)
Instructor: Mark Gerstein and Matthew Simon
Time: Mon, Wed 1:00-2:15
Place: 

Data Analysis (S&DS 361b/661b)
Instructor: Brian MacDonald
Time: Tues, Thurs 9:00-10:15
Place: TBD
Selected topics in statistics explored through analysis of data sets using the R statistical computing language. Topics include linear and nonlinear models, maximum likelihood, resampling methods, curve estimation, model selection, classification, and clustering.

After or concurrently with STAT 242 and MATH 222 or 225, or equivalents.

Multivariate Statistics for Social Sciences (S&DS 363b/563b)
Instructor: Jonathan Reuning-Scherer
Time: Tues, Thurs 1:00-2:15
Place: TBD
Introduction to the analysis of multivariate data as applied to examples from the social sciences. Topics include principal components analysis, factor analysis, cluster analysis (hierarchical clustering, k-means), discriminant analysis, multidimensional scaling, and structural equation modeling. Extensive computer work using either SAS or SPSS programming software.

Prerequisites: knowledge of basic inferential procedures and experience with linear models.

Information Theory (S&DS 364b/664b)
Instructor: Andrew Barron
Time: Tues, Thurs 11:35-12:50
Place: TBD
Foundations of information theory in mathematical communications, statistical inference, statistical mechanics, probability, and algorithmic complexity. Quantities of information and their properties: entropy, conditional entropy, divergence, redundancy, mutual information, channel capacity. Basic theorems of data compression, data summarization, and channel coding. Applications in statistics and finance. After Statistics 241.

Intermediate Machine Learning (S&DS 365/665)
Instructor: John Lafferty
Time: Mon, Wed 1:00-2:15
Place: LC 101
Techniques for data mining and machine learning are covered from both a statistical and a computational perspective, including support vector machines, bagging, boosting, neural networks, and other nonlinear and nonparametric regression methods. The course will give the basic ideas and intuition behind these methods, a more formal understanding of how and why they work, and opportunities to experiment with machine learning algorithms and apply them to data. After STAT 242b.

Applied Data Mining and Machine Learning (S&DS 365b/665b)
Instructor: Sahand Negahban
Time: Mon, Wed 11:35-12:50
Place: TBD
Techniques for data mining and machine learning are covered from both a statistical and a computational perspective, including support vector machines, bagging, boosting, neural networks, and other nonlinear and nonparametric regression methods. The course will give the basic ideas and intuition behind these methods, a more formal understanding of how and why they work, and opportunities to experiment with machine learning algorithms and apply them to data. After STAT 242b.

[ Design and Analysis of Algorithms (CPSC 365b) ]

Advanced Probability (S&DS 400/600 MATH 330)
Instructor: Sekhar Tatikonda
Time: Tues, Thurs 2:30-3:45
Place: WTS A51
Measure-theoretic probability, conditioning, laws of large numbers, convergence in distribution, characteristic functions, central limit theorems, martingales. Some knowledge of real analysis is assumed.

Statistical Inference (S&DS 410/610)
Instructor: Harrison Zhou
Time: Tues, Thurs 11:35-12:50
Place: LUCE 202
A systematic development of the mathematical theory of statistical inference covering methods of estimation, hypothesis testing, and confidence intervals. An introduction to statistical decision theory. Undergraduate probability at the level of Statistics 241a assumed.

Statistical Case Studies (S&DS 425/625)
Instructor: Jay Emerson
Time: Fri 9:25-11:15
Place: TBD
Statistical analysis of a variety of statistical problems using real data. Emphasis on methods of choosing data, acquiring data, assessing data quality, and the issues posed by extremely large data sets. Extensive computations using R. This is a seminar of limited size but this iteration gives priority to graduate students. A final project is required. S&DS or Applied Math majors who previously took Statistical Case Studies are not permitted to take this course.

Statistical Case Studies (S&DS 425)
Instructor: Brian MacDonald
Time: Mon, Wed 1:00-2:15
Place: 17HLH 111
Webpage:  https://classesv2.yale.edu/
Statistical analysis of a variety of statistical problems using real data. Emphasis on methods of choosing data, acquiring data, assessing data quality, and the issues posed by extremely large data sets. Extensive computations using R. This is a seminar of limited size and is not for capstone credit.

[ Optimization Techniques (S&DS 430a/630a ENAS 530a EENG 437a ECON 413a) ]

Independent Study (S&DS 480ab)
Instructor: Staff
Time: -
Place: -
Directed individual study for qualified students who wish to investigate an area of statistics not covered in regular courses. A student must be sponsored by a faculty member who sets the requirements and meets regularly with the student. Enrollment requires a written plan of study approved by the faculty adviser and the director of undergraduate studies.

Permission required. No final exam.

[ Senior Seminar and Project (S&DS 490a) ]

Senior Seminar and Project (S&DS 490b)
Instructor: TBD or not offered
Time: TBD
Place: 24 Hillhouse Room 107
Under the supervision of a member of the faculty, each student works on an independent project. Students participate in seminar meetings at which they speak on the progress of their projects.

Permission required. No final exam.

Senior Project (S&DS 491)
Instructor: Brian MacDonald
Time: -
Place: -
Individual research that fulfills the S&DS senior requirement. Requires a faculty adviser and DUS permission. The student must submit a written report about results of the project.

Senior Project (S&DS 492b)
Instructor: Brian MacDonald
Time: -
Place: -
Individual research that fulfills the S&DS senior requirement. Requires a faculty adviser and DUS permission. The student must submit a written report about results of the project.

[ Research Design and Causal Inference (PLSC 508a) ]

Design-Based Inference for the Social Sciences (PLSC 528b)
Instructor: Peter Aronow
Time: Mon 3:30-5:20
Place: TBD
Introduction to design-based statistical approaches to survey sampling and causal inference. Design and analysis of complex survey samples and randomized experiments, including model-assisted approaches. Discussion of recent advances in this paradigm, including inference in network settings. Prerequisite: knowledge of statistical theory at the level of PLSC 500 is assumed, with familiarity with probability and estimation theory. Alternative prerequisite courses include S&DS 542 or ECON 550.

[ Applied Linear Models (S&DS 531a) ]

Theory of Statistics (S&DS 542)
Instructor: Andrew Barron
Time: Tues, Thurs 9:00-10:15?
Place: 
Principles of statistical analysis: maximum likelihood, sampling distributions, estimation, confidence intervals, tests of significance, regression, analysis of variance, and the method of least squares. Intended for Statistics Master's students; others may be admitted with consent of instructor. After or concurrently with Statistics 541a.

[ Intensive Algorithms (S&DS 566) ]

Topics in Deep Learning: Methods and Biomedical Applications (S&DS 567 CB&B 567)
Instructor: Martin Renqiang and Mark Gerstein
Time: Mon 9:00-11:15
Place: TBD
This course provides an introduction to recent developments in deep learning, covering topics ranging from basic backpropagation and optimization to the latest developments in deep generative models and network robustness. Applications in Natural Language Processing and Computer Vision will be used as running examples. Several case studies in biomedical applications will be covered in detail.

Numerical Linear Algebra: Deterministic and randomized algorithms (S&DS 569)
Instructor: Roy Lederman
Time: Mon, Wed 9:00-10:15
Place: TBD
TBD

Selected Topics in Statistical Decision Theory (S&DS 411a/611b)
Instructor: Harrison Zhou
Time: Mon 10:30-12:20
Place: TBD
In this course we will review some recent developments in statistical decision theory, including nonparametric estimation, high-dimensional (non)linear estimation, low-rank and sparse matrix estimation, covariance matrix estimation, graphical models, and network analysis.

[ Introduction to Random Matrix Theory and Applications (S&DS 615b) ]

Causal Inference and Research Design (S&DS 616 PLSC 508)
Instructor: P Aronow
Time: Thurs 4:00-5:50
Place: TBD
Research seminar

Applied Machine Learning and Causal Inference Research Seminar (S&DS 617)
Instructor: Jas Sekhon
Time: Mon 1:30-3:20
Place: TBD
Research seminar, graduate number needed.

Asymptotic Statistics (S&DS 618)
Instructor: Zongming Ma
Time: Tues 7:00-8:50
Place: TBD

Statistical Case Studies (S&DS 625)
Instructor: Brian MacDonald
Time: Mon, Wed 2:30-3:45
Place: 17HLH 101 - TEAL
Webpage:  https://classesv2.yale.edu/
Statistical analysis of a variety of statistical problems using real data. Emphasis on methods of choosing data, acquiring data, assessing data quality, and the issues posed by extremely large data sets. Extensive computations using R. Limited size, with permission from the instructor required. This course is likely limited to graduate students in S&DS; undergraduate version 425 is available.

Practical Work (S&DS 626ab)
Instructor: DGS
Time: -
Place: -
Individual one-semester projects, with students working on studies outside the Department, under the guidance of a statistician. This course is a one-credit requirement for the Ph.D. degree.

Statistical Consulting (S&DS 627a/628b)
Instructor: Jay Emerson
Time: Fri 2:30-4:30
Place: 24 Hillhouse
Webpage:  http://www.stat.yale.edu/~jay/627.html
Statistical consulting and collaborative research projects often require statisticians to explore new topics outside their area of expertise. This course exposes students to real problems, requiring them to draw on their expertise in probability, statistics, and data analysis. Students complete the course with individual projects supervised jointly by faculty outside the department and by one of the instructors. Students enroll for both terms and receive one credit at the end of the year.

Computation and Optimization (S&DS 431/631)
Instructor: Zhuoran Yang
Time: Tues, Thurs 1:00-2:15
Place: TBD

Advanced Optimization Techniques (S&DS 432b/632b)
Instructor: Zhuoran Yang
Time: Tues, Thurs 1:00-2:15
Place: TBD

Statistical Methods in Computational Biology (S&DS 645b/BIS 692b/CB&B 692)
Instructor: Hongyu Zhao
Time: Thurs 10:00-11:50
Place: TBD
Probability modeling and statistical methodology for the analysis of human genetics data are presented. Topics include population genetics, single locus and polygenic inheritance, linkage analysis, genome-wide association studies, quantitative trait locus analysis, rare variant analysis, and genetic risk predictions. Offered every other year. Prerequisite: EPH 505 and BIS 505, or equivalents, and permission from the instructor.

Markov chains for sampling and optimization (S&DS 652a)
Instructor: Andrew Barron
Time: Tues, Thurs 1:00-2:15
Place: 

Statistical Computing (S&DS 662)
Instructor: Jay Emerson (rethink)
Time: Tues, Thurs 4:00-5:15
Place: 17HLH 101 - TEAL
Topics in the practice of data analysis and statistical computing, with particular attention to problems involving massive data sets or large, complex simulations and computations. Programming with R, C/C++, and Python, computational efficiency, memory management, interactive and dynamic graphics, and parallel computing.

[ Spectral Graph Theory (CPSC 662a) ]

Computational Mathematics for Data Science (S&DS 663)
Instructor: Roy Lederman
Time: Mon, Wed 1:00-2:15
Place: WLH 211
Using a computer to analyze data? Will the computer ever finish processing? Will the result be junk? Will you recognize that it is junk? We will discuss the difference between math on paper and math on a computer and the difference between programming and mathematical programming. The primary approach to our investigation will be making mistakes and analyzing them. We will experience benign mathematical operations failing catastrophically without telling us. We will experience equivalent operations taking anywhere between a fraction of a second and a lifetime. We will practice survival techniques for this harsh environment. We will discuss topics in numerical computation, complexity, programming, and prototyping. Assignments include theory, programming, data analysis, individual work, and collaborative work. We will use C (and/or FORTRAN) and Python or Julia. This is not a programming course or a course in debugging. Prerequisites: linear algebra, multivariate calculus, and some experience in programming (any language). Instructor permission is required.

[ Probabilistic Networks, Algorithms, and Applications (S&DS 667a) ]

[ Nonparametric Estimation and Machine Learning (S&DS 468b) ]

Statistical Learning Theory (S&DS 669a)
Instructor: Sahand Negahban?
Time: TBD
Place: 24 Hillhouse
Introduction to the theoretical analysis of machine learning algorithms, with a focus on statistical and computational aspects. Covers subjects such as decision theory, empirical process theory, and convex optimization. Prerequisites: linear algebra, multivariable calculus, stochastic processes, and an introduction to machine learning such as Stat 365b or a similar course.

Theory of Deep Learning (S&DS 670a)
Instructor: Andrew Barron
Time: Mon, Wed 9:00-10:15
Place: 24 Hillhouse
Deep neural networks and related statistical learning theory are developed for high-dimensional function estimation and classification. Complexity, approximation capability, statistical accuracy, penalized least squares, and stochastic optimization are explored. Students will be expected to propose a topic of investigation (e.g., a literature review or a computational or theoretical exploration of the methods discussed) and to provide a final report. This course is intended for students with a background in probability, statistics, and computation.

[ Topics on Random Graphs (MATH 670) ]

[ Information Theory Tools in Probability and Statistics (S&DS 672a) ]

Applied Spatial Statistics (S&DS 674b/F&ES 781b)
Instructor: Tim Gregoire
Time: Tues, Thurs 10:30-11:50
Place: TBD
An introduction to spatial statistical techniques with computer applications. Topics include spatial sampling, visualizing spatial data, quantifying spatial association and autocorrelation, interpolation methods, fitting variograms, kriging, and related modeling techniques for spatially correlated data. Examples are drawn from ecology, sociology, public health, and subjects proposed by students. Four to five lab/homework assignments and a final project. The class makes extensive use of the R programming language as well as ArcGIS.

[ Topological Data Analysis (S&DS 675a) ]

[ Signal Processing for Data Science (S&DS 676b) ]

Information-theoretic methods in high-dimensional statistics (S&DS 677b)
Instructor: Yihong Wu
Time: Tues 4:00-5:50
Place: TBD
The interplay between information theory and statistics is a constant theme in the development of both fields. This course will discuss how techniques rooted in information theory play a key role in understanding the fundamental limits of high-dimensional statistical problems in terms of minimax risk and sample complexity. In particular, we will rigorously justify the phenomenon of dimensionality reduction by either intrinsic low-dimensionality (sparsity, smoothness, shape, etc.) or the less familiar extrinsic low-dimensionality (functional estimation). Complementing this objective of understanding the fundamental limits, another significant direction is to develop computationally efficient procedures that attain statistical optimality, or to understand the lack thereof.

[ Function Estimation (S&DS 679) ]

[ High-Dimensional Function Estimation (prev title) (S&DS 682a) ]

[ Statistical Methods in Neuroimaging (S&DS 683a) ]

Statistical Inference on Graphs (S&DS 684)
Instructor: Yihong Wu?
Time: Tues 4:00-5:50
Place: 17 Hillhouse 03
An emerging research thread in statistics and machine learning deals with finding latent structures from data represented in graphs or matrices. This course will provide an introduction to mathematical and algorithmic tools for studying such problems. We will discuss information-theoretic methods for determining the fundamental limits, as well as methodologies for attaining these limits, including spectral methods, semidefinite programming relaxations, and message passing algorithms. Specific topics will include spectral clustering, planted clique and partition problems, sparse PCA, community detection on stochastic block models, and statistical-computational tradeoffs.

Theory of Reinforcement Learning (S&DS 685)
Instructor: Zhuoran Yang?
Time: Fri 9:25-11:15
Place: TBD

High-dimensional phenomena in statistics and learning (S&DS 686)
Instructor: Zhou Fan?
Time: Wed 4:00-6:30
Place: 24 Hillhouse

TBD (S&DS 687)
Instructor: Zongming Ma
Time: TBD
Place: TBD

Computational and Statistical Trade-offs in High Dimensional Statistics (S&DS 688)
Instructor: Ilias Zadik
Time: Tues 4:00-5:50
Place: TBD
Modern statistical tasks require methods that are both computationally efficient and statistically accurate. But can we always find a computationally efficient method that achieves the information-theoretically optimal statistical guarantees? If not, is this an artifact of our techniques, or a potentially fundamental source of computational hardness? This course will survey a new and growing research area studying such questions at the intersection of high-dimensional statistics and theoretical computer science. We will discuss various tools to explain the presence of such “computational-to-statistical gaps” for several high-dimensional inference models. These tools include the “low-degree polynomials” method, statistical query lower bounds, and more. We will also discuss connections with other fields such as statistical physics and cryptography.

Scientific Machine Learning (S&DS 689)
Instructor: Lu Lu
Time: Tues, Thurs 4:00-5:15
Place: TBD

Independent Study or Topics Course (S&DS 690ab)
Instructor: DGS
Time: -
Place: -
By arrangement with faculty. Approval of Director of Graduate Studies required.

[ Research Seminar in Probability (S&DS 699ab) ]

Departmental Seminar (S&DS 700ab)
Instructor: -
Time: Mon 4:00-5:30
Place: 24 Hillhouse
Webpage:  http://www.stat.yale.edu/Seminars/2011-12/
Important activity for all members of the department. See webpage for weekly seminar announcements.

[ Placeholder -- Monograph (706) ]