If you get lost on the Web, look at http://statlab.stat.yale.edu (look under ABOUT THE STATLAB for Course Materials, then follow the links through to the syllabus for Stat 200).
Start Splus by clicking on the icon with the little blue (green?) squares. (Note: You do want to clean up the data directory, at least on your first session.) Try a few calculations. (The > sign at the start of a line is the prompt; you don't type it. I have included some of the responses from Splus.) Follow along with the calculations, typing them in at your own machine. Try to understand the response from Splus. We will come around the class and ask you for your interpretations.
> pi
[1] 3.141593
> x<-sqrt(3)
> x
[1] 1.732051
> y <- c(2,3,4)
> mm <- matrix(1:10,2,5)

Interpret: x^y, y^x, y^mm, x^mm, x+y, y+mm

What is the difference between: mean(x), mean(mm), mean(), mean?
> z <- seq(1.5,1.9,0.1)
> z
[1] 1.5 1.6 1.7 1.8 1.9
> z[2]
[1] 1.6
> z[1:3]
[1] 1.5 1.6 1.7
> z>pi/2
[1] F T T T T
> z[z>pi/2]
[1] 1.6 1.7 1.8 1.9

What happened?
> help(asin)
> ?asin

How many ways are there to get help on an Splus function?
Get help on the q() function to find out how to quit.
Explain why
> asin(x/2)/pi
gives the value it does.
> mm>5
     [,1] [,2] [,3] [,4] [,5]
[1,]    F    F    F    T    T
[2,]    F    F    T    T    T
> mm[mm>5]
[1]  6  7  8  9 10
> mm + (mm > 5)   # true = 1, false = 0
> dim(mm)
> dim(y)
> length(mm)
> length(y)

What does this say about how the matrix is stored?
Try:
> AA <- matrix(1:9,3,3)
> BB <- matrix(1:9,3,3,byrow=T)
> t(AA)   # t() is the transpose function

Now try some multiplications:
> AA * AA
> AA %*% AA

What do these two products represent? Try:
> mm%*%seq(5,length=5)

What happened?
What does diag do? Try:
> diag(1:5)
> diag(mm)
> ll <- list()
> ll$first <- "Hello there"
> ll$m<-mm
> ll$last <- "Goodbye"
> ll
$first:
[1] "Hello there"

$m:
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

$last:
[1] "Goodbye"
> test1 <- function(M){
    ave<-mean(M)
    sum((M-ave)^2)
  }
> test1(ll$m)
[1] 82.5

What happened? Try feeding the function some other objects. Try using the fix() function to construct your own version of test1. [It will probably take you a little while to get used to some of the quirks of fix().]
> names(car.all)
> car <- car.all[sample(1:111,40),c("We","HP","Pri","Mil","Cou")]
> car
> attributes(car)   # what are attributes?
> attributes(car$Country)

What did each of those commands do?
You now have a data frame called car whose 40 rows were sampled at random from the 111 rows of car.all, and whose 5 columns were identified uniquely by the first few letters of column names from car.all. (Splus tries very hard to make sense of what you type. If it can identify a unique component of a list or data frame from the first few letters it will not demand the whole name. Beware: there are some places where abbreviations can lead to results that you might not expect. Splus tries hard, but it can't read your mind.)
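The abbreviation rule can be seen on a small list. What follows is only a sketch: the list `demo` and its component names are invented for illustration, and the behaviour described in the comments is what R does with the same S syntax; your version of Splus may respond a little differently, so check it at your own machine.

```r
# A small hypothetical list (not part of the course data)
demo <- list(first = "Hello there", last = "Goodbye", lastname = "Smith")

a <- demo$fi     # "fi" matches only "first", so the abbreviation is accepted
b <- demo$last   # an exact match is used even though "lastname" also starts with "last"
d <- demo$las    # "las" matches both "last" and "lastname": ambiguous, nothing is returned

a
b
is.null(d)
```

The third case is the kind of surprise the warning is about: no error is reported, you simply get back nothing.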
Try:
> table(car$Co)
> win.graph()   # start up a graphics device
> plot(car$W,car$P)   # plot prices against weights
> plot(car$W,car$M)
> plot(1/car$W,car$M)   # what is plotted?
> pairs(car)   # plot all possible pairs

Try a linear model with mileage predicted by a linear function of weight:
> lm(car$Mil ~car$We)   # problems with missing values?
> lm(car$Mil ~car$We,na.action=na.omit)   # what does NA stand for?

Same idea, but save the output for future reference:
> reg1 <- lm(car$Mil ~car$We,na.action=na.omit)
> reg1   # look at it
> names(reg1)   # names of components
> plot(reg1$fit,reg1$res)   # a plot of residuals versus fitted values

Try modelling mileage as a linear function of weight and price:
> reg2 <- lm(car$Mil ~car$We+car$Pr,na.action=na.omit)

As you can see, there is a special shorthand for describing statistical models. The same shorthand is used for many statistical functions, but (unfortunately) not for all statistical functions. The bits of Splus that have survived from its original incarnation usually offer fewer bells and whistles than their fancier, more recent offspring. Splus is still growing.
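To make the shorthand concrete, here is a minimal sketch on a toy data frame. The data frame `toy` and every number in it are made up for illustration (they are not taken from car.all); only the column names imitate the car data. In a model formula the left side of `~` is the response, and `+` on the right side means "also include this predictor", not arithmetic addition.

```r
# Hypothetical toy data: the numbers are invented, one mileage is missing
toy <- data.frame(Weight  = c(2000, 2500, 3000, 3500, 4000),
                  Price   = c(9000, 11000, 15000, 14000, 21000),
                  Mileage = c(35, NA, 26, 22, 19))

# Mileage modelled as a linear function of Weight and Price;
# the names in the formula are looked up in the data frame given by data=
fit <- lm(Mileage ~ Weight + Price, data = toy, na.action = na.omit)

coef(fit)            # an intercept plus one slope for each predictor
length(resid(fit))   # one residual for each row without missing values
```

Supplying the data frame through `data=` is an alternative to writing `toy$Mileage ~ toy$Weight + toy$Price`; both describe the same model.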
> xvalues <- seq(-2,2,by=0.05)
> plot(xvalues,exp(-xvalues^2))

Problem: Use the help for "plot" to figure out how to get a smooth line for the plot (connect the dots). Fancier: Figure out how to put a title on the plot and change the axis labels.
I have been working for a while, creating S objects. This is what I have so far:
> objects()
 [1] ".Last.fixed" ".Last.value" "last.dump"
 [4] "ll"          "mm"          "test1"
 [7] "test2"       "test3"       "values"
[10] "x"           "xvalues"     "y"
[13] "z"

Now I need coffee. I put my floppy disk into the slot (it becomes drive A:), then dump (everything):
> dump(objects(),"A:\\sept4")
[1] "A:\\sept4"

For the purposes of a test, I kill everything, then try to restore it from my floppy disk.
> remove(objects())   # all gone:
> objects()
character(0)
> source("A:\\sept4")   # Back again:
> objects()
 [1] ".Last.fixed" ".Last.value" "last.dump"
 [4] "ll"          "mm"          "test1"
 [7] "test2"       "test3"       "values"
[10] "x"           "xvalues"     "y"
[13] "z"
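If you want to rehearse the same round trip without a floppy disk, the sketch below does it for a single object. The object name `keep` and the use of a temporary file in place of "A:\\sept4" are assumptions made for this illustration only.

```r
tmp <- tempfile()     # a scratch file standing in for the floppy disk
keep <- c(2, 3, 4)    # a hypothetical object worth saving

dump("keep", tmp)     # write out text that will recreate 'keep'
remove("keep")        # delete the object from the working data
source(tmp)           # evaluate the dumped text: 'keep' is back
keep
```

The dumped file is ordinary text made up of S commands, so you can open it in an editor to see exactly what source() will run.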