Statistics 200, Fall 1997
Statistical Computing Laboratory
Instructors: Brendan Murphy and David Pollard
Class hours: Friday 2.30-5.00
Stat Lab, 140 Prospect
This class provides an introduction to the S-Plus statistical
language. S-Plus (marketed by
MathSoft) is based on the S language developed at Bell Labs by
John Chambers and Richard Becker. It has become the accepted
language for advanced statistical computing. New advances in
statistical methodology are routinely made available as S-routines.
Unlike statistical packages such as SAS, Systat, SPSS or BMDP, S is a language
for expressing statistical models and statistical procedures. There are many
functions available in S for performing routine statistical procedures such
as regression, analysis of variance, survival analysis, plotting, and tables; but the
essence of S is that you can modify the available functions, and write your own
functions, to get what you want.
Students from other Statistics courses that assume knowledge of S-Plus are
encouraged to attend at least the first five weeks of Stat200.
- Special MathSoft offer: Students (with a valid Yale student-id) may purchase a copy of S-Plus
(version 4.0 for Windows) from the Statistics department for only $53 (the list
price is considerably higher).
Requirements of the course
Classroom exercises to be handed in each week.
You should be able to work through the material during the class session.
Please bring a floppy disk to each session to save your work.
Topics to be covered
Click on the class number to view the corresponding handout for the session.
The first five weeks will provide a crash-course in S-Plus.
At the moment the links point to the lab sessions for last spring.
We will be making some changes to the details, and perhaps to the
ordering of sessions, as the course unfolds. (We like to learn from
our mistakes.) We will flag the sessions as [Fall97] when they have been updated.
Getting into and out of S-Plus. Help! Saving your work. Incorporating S-Plus
output in reports. Introduction to lists, vectors, matrices,
functions, and graphics. An illustrative example.
Reading data from other sources; data from WWW sites (such as
StatLib at Carnegie-Mellon University, and the
U.S. Census Bureau). Data frames. Evaluation frames.
Libraries. Search lists.
Fancier graphics: beyond the defaults. Multiple plots per page,
split screens, "graphics frames".
Manipulation of matrices, arrays, and tables. Cross-tabulation of
More about functions. Default arguments, variable numbers of
arguments, return values. Idiot-proofing. Looping and conditional computations.
End of crash course in S-Plus
Data structures. Regression and model fitting. Manipulation
of lm objects.
Low-level graphics. Construction of customized plots.
Classes and methods.
Can you trust qqnorm()?
Or look at http://statlab.stat.yale.edu (under ABOUT THE STATLAB, choose
Course Materials, then follow the links through to this page).