## Statistical Computing Laboratory

Instructor: David Pollard
Email: pollard@stat.yale.edu
Class hours: Friday 2.30-5.00, Stat Lab, 140 Prospect

This class provides an introduction to the S-Plus statistical language. S-Plus (marketed by MathSoft) is based on the S language developed at Bell Labs by John Chambers and Richard Becker. It has become widely accepted as the language for advanced statistical computing, and new advances in statistical methodology are routinely made available as S routines.

Unlike statistical packages such as SAS, Systat, SPSS or BMDP, S is a language for expressing statistical models and statistical procedures. Many functions are available in S for routine statistical tasks such as regression, analysis of variance, survival analysis, plotting, and tabulation; however, the essence of S is that you can modify the available functions, and write your own functions, to get what you want.

• Students from other Statistics courses that assume knowledge of S-Plus are encouraged to audit the first four weeks of Stat200.
• Special MathSoft offer: Students (with a valid Yale student-id) may purchase a copy of S-Plus (version 3.3 for Windows) from the Statistics department for only \$53 (list price is \$495, with \$295 as the usual student rate). Enquiries to: Mrs. Kennedy.

### Requirements of the course

Classroom exercises are to be handed in each week. You should be able to work through the material during the class time.

Please bring a floppy disk to each session to save your work.

### Topics to be covered

Click on the class number to view the corresponding handout for the lab session.
The first four weeks will provide a crash course in S-Plus.
1. Getting into and out of S-Plus. Help! Saving your work. Incorporating S-Plus output in reports. Introduction to lists, vectors, matrices, functions, and graphics.
2. Reading data from other sources; data from WWW sites (such as StatLib at Carnegie Mellon University, and the U.S. Census Bureau). Data frames. Evaluation frames. Libraries. Search lists.
3. Fancier graphics: beyond the defaults. Multiple plots per page, split screens, "graphics frames", maps.
4. More about functions. Default arguments, variable numbers of arguments, return values. Idiot-proofing. Looping and conditional computations.

End of crash course in S-Plus.

5. Manipulation of matrices and arrays.
6. Manipulation of data objects. Data structures. Regression and model fitting.
7. Low-level graphics. Construction of customized plots.
8. Trellis graphics.
9. Time series. (Including a little bit about classes and methods: object-oriented programming.)
10. Some specialized statistical techniques.
11. Special projects.

### References
• Richard A. Becker, John M. Chambers, and Allan R. Wilks (1988) The New S Language: A programming environment for data analysis and graphics. Wadsworth. [Slightly dated. Just for reference.]
• W. N. Venables and B. D. Ripley (1995) Modern Applied Statistics with S-Plus. Springer-Verlag. [Recommended text] See also the Complements on the WWW.
• Phil Spector (1994) An Introduction to S and S-Plus. Duxbury. [Much gentler introduction than Venables and Ripley. No coverage of more recent developments. Very little material specific to the Windows version.]

URL: "http://www.stat.yale.edu/Courses/200fall.html"