Statistics 230/530: Data Analysis (spring 2003)
Survey of statistical methods: plots, transformations, regression,
analysis of variance, clustering, principal components, contingency
tables, and time series analysis. Uses S-Plus and Web data
sources. After or concurrent with Statistics 101-106.
| Instructor: | David Pollard |
|---|---|
| Email: | david.pollard@yale.edu |
| Office hours: | Tuesday 11:00 - 12:00 at StatLab; more if needed. |
| Time: | Monday, Wednesday 1:00 - 2:45 (first session); Monday, Wednesday 2:30 - 3:45 (second session). |
| Place: | StatLab (140 Prospect Street) |
| TAs: | Kun Gao and Eddie Valaitis |
16 April 2003
Folks,
We have three more classes after today. During that time I'll present
some material that I hope will help you with your final projects.
You should talk with me asap to arrange your project. It is important
that you do not take on something too ambitious for the time available:
the projects are due on 28 April.
I expect that many of you will want to use some part of the FARS data.
You must all work on different topics--no joint work.
The project will count as two problem sets for the final grade.
Please talk with me soon.
DP
Grading
The entire course is built around weekly homeworks and a final project (of
modest size). There are no exams.
Points scored on sheets 1 to 4.
Software
The course will be taught using S-Plus. No prior acquaintance with
the language is needed. Yale students may obtain, FREE OF CHARGE, a copy of S-Plus to
install on a (Windows) personal computer. For details,
read the directions
on the StatLab website.
There is also an implementation of the S language for the Macintosh,
called R. It is not as polished as the commercial S-Plus product but
it is available for FREE from the
CRAN website. (I use R
on my Mac laptop.)
Texts and references
There is no single text for the course. Much of what you need to learn
about S-Plus can be found in the help menus of the program.
- Notes from S-Plus workshop taught by Marios Panayides.
- Need more help with S-Plus? Try http://www.yale.edu/mstutor/statstutors.html.
- Notes written by John Hartigan when he taught Statistics 230/530 in spring 2002.
- Freedman, Pisani and Purves, "Statistics".
A good source for the statistical ideas that lie behind the scenes. At the level of Statistics 101-106.
- Copious documentation in pdf format from the S-Plus web site.
- Krause and Olson, "The Basics of S-Plus".
I have seen only an earlier edition, which provided a good introduction to an earlier version of S-Plus. When my copy of the new edition arrives, my opinion might change.
- Venables and Ripley, "Modern Applied Statistics with S".
Covers a lot about S and the underlying statistical theory. A standard reference, but not easy reading without a good prior grasp of statistics.
Weekly topics
- Introduction to S, using data from Census 2000. [start]
[Homework due Wednesday 22 January]
- Wisconsin breast cancer. [start]
[Homework due Wednesday 29 January]
- Civil justice survey of state courts. [start]
[Homework due Wednesday 12 February]
- Yale student grades. [start]
[Homework due Wednesday 5 March] [discussion of additive fits to homework scores]
- Figure skating. [start]
- Traffic fatalities: Fatality Analysis Reporting System (FARS). [start]
Homework 5: due Wed 2 April
Homework 6: due Wed 9 April
DP: 12 Jan 2003