If you get lost on the Web, look at http://statlab.stat.yale.edu (look under ABOUT THE STATLAB for Course Materials, then follow the links through to the syllabus for Stat 200).
To start Splus, click on the Splus icon (little blue/green squares) in the Applications window. You will be asked if you want to clean up the data directory. Since this is your first time using Splus, you should probably click "Yes". The Splus window contains a command window with a > sign on the left; this is where commands are entered. We are now ready to try a few calculations.
(The > sign at the start of a line is the prompt; you don't type it. The responses that Splus gives are also included.)
> pi
[1] 3.141593
Yes, Splus knows the value of that famous constant.
> x<-sqrt(3)
> x
[1] 1.732051
> v <- c(2,3,4)
> v
> M <- matrix(1:10,2,5)
> M
Note that Splus is case sensitive!
Some other arithmetic operators in Splus are: +, -, *, /, ^, %*%, etc.
Splus has many built-in functions, and you will become familiar with these as the course progresses.
To illustrate the use of the mean function, try to find the difference between: mean(x), mean(M), mean(), mean?
> help(asin)
> ?asin
How many ways are there to get help on an Splus function?
Explain why
> asin(x/2)/pi
gives the value it does.
Get help on the q() function to find out how to quit from Splus.
> z <- seq(1.5,1.9,0.1)
> z
[1] 1.5 1.6 1.7 1.8 1.9
Vectors can be indexed in many different ways. We illustrate four of them; can you interpret them?
> z[2]
[1] 1.6
> z[1:3]
[1] 1.5 1.6 1.7
> z>pi/2
[1] F T T T T
> z[z>pi/2]
[1] 1.6 1.7 1.8 1.9
> z[-c(1,3)]
[1] 1.6 1.8 1.9
> names(z)<-c("First","Second","Third","Fourth","Fifth")
> z
 First Second  Third Fourth  Fifth
   1.5    1.6    1.7    1.8    1.9
> z[c("Second","Fifth")]
Second  Fifth
   1.6    1.9
What happened?
> M
> M[2,3]
> M[,3]
> M[2,]
> M[,-c(1,2)]
> M>5
     [,1] [,2] [,3] [,4] [,5]
[1,]    F    F    F    T    T
[2,]    F    F    T    T    T
> M[M>5]
[1]  6  7  8  9 10
How are matrices indexed? Remember the ways of indexing vectors.
> M + (M > 5)
How does Splus treat logical variables when included in arithmetic expressions?
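A minimal sketch of what to look for, with the matrix M rebuilt here so that the lines stand alone; the name bumped is only illustrative. In arithmetic, a logical value counts as 1 when true and 0 when false, so every entry of M greater than 5 goes up by one.

```r
# Rebuild the 2 x 5 matrix used above so this sketch is self-contained
M <- matrix(1:10, 2, 5)

# M > 5 is a logical matrix; in arithmetic, true counts as 1 and false as 0
bumped <- M + (M > 5)
bumped

# The same coercion lets sum() count how many entries exceed 5
sum(M > 5)
```

The same idea is handy for counting: summing a logical vector gives the number of true entries.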
> dim(M)
> dim(v)
> length(M)
> length(v)
How does the length function differ from the dim function?
Try the following two ways of creating a matrix:
> AA <- matrix(1:9,3,3)
> BB <- matrix(1:9,3,3,byrow=T)
> t(AA)    # t() is the transpose function
Now try some multiplications:
> AA * AA
> AA %*% AA
What do these two products represent? Try:
> mm%*%seq(5,length=5)
What happened?
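The two kinds of product are easier to tell apart on a case small enough to check by hand. The sketch below uses a hypothetical 2 x 2 matrix A, which is not one of the objects created above.

```r
# A small hypothetical matrix, filled column by column:
# first row is 1 3, second row is 2 4
A <- matrix(1:4, 2, 2)

# Elementwise product: each entry is multiplied by the matching entry
A * A

# Matrix product: rows of the first factor times columns of the second
A %*% A

# A plain vector on the right of %*% is treated as a column, so a
# 2 x 2 matrix times a length-2 vector gives a 2 x 1 matrix
A %*% c(1, 1)
```

For %*% to work, the number of columns on the left must match the number of rows (or the vector length) on the right.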
What does the diag function do? Try:
> diag(1:5)
> diag(mm)
> ll <- list()
> ll$first <- "Hello there"
> ll$mat<-M
> ll$const<-pi
> ll$last <- "Goodbye"
> ll
$first:
[1] "Hello there"
$mat:
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
$const:
[1] 3.141593
$last:
[1] "Goodbye"
Here's a very simple function for you to try first:
> test1 <- function(m){
    ave <- mean(m)
    sum((m - ave)^2)
}
> test1(ll$mat)
[1] 82.5
What happened? Try feeding the function some other objects (x, v or M, for example). You can easily change your functions by using the up-arrow key to recall the last lines that were entered. Could you edit test1 so that it also outputs the mean of the object? Think about lists!
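One possible approach, offered as a sketch rather than the intended answer: a function returns only its last value, so two results can be bundled into a list. The function name test1.list and the component names mean and ss are made up for this illustration.

```r
# Hypothetical variant of test1 that returns two results in one list
test1.list <- function(m){
    ave <- mean(m)
    # The last expression is the value of the function
    list(mean = ave, ss = sum((m - ave)^2))
}

# Same matrix as ll$mat, rebuilt here so the sketch stands alone
out <- test1.list(matrix(1:10, 2, 5))
out$mean   # 5.5
out$ss     # 82.5
```

Each result can then be pulled out by name with $, just like the components of ll.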
> names(car.all)
> car <- car.all[sample(1:111,40),c("We","HP","Pri","Mil","Cou")]
You might want to check the help files for sample.
> car
> attributes(car)
> attributes(car$Country)
What did each of those commands do? What are attributes?
You now have a data frame called car whose 40 rows were sampled at random from the 111 rows of car.all, and whose 5 columns were identified uniquely by the first few letters of column names from car.all. (Splus tries very hard to make sense of what you type. If it can identify a unique component of a list or data frame from the first few letters it will not demand the whole name. Beware: there are some places where abbreviations can lead to results that you might not expect. Splus tries hard, but it can't read your mind.)
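A small sketch of how abbreviations can go wrong with $, using a hypothetical list cars.demo whose component names were chosen so that two of them share a prefix. None of these names or numbers come from car.all.

```r
# A hypothetical list: Price and Priority share the prefix "Pri"
cars.demo <- list(Weight = c(2560, 2345), Price = 8895, Priority = "low")

cars.demo$W      # unique prefix: picks out Weight
cars.demo$Pric   # long enough to tell Price from Priority
cars.demo$Pri    # ambiguous prefix: matches both Price and Priority
```

When in doubt, type the full name; the few extra keystrokes are cheaper than a wrong answer.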
Try the following commands and interpret them carefully:
> table(car$Co)
> win.graph()
Yes, here's your graphics window!
This is how you plot variables:
> plot(car$W,car$P)
> plot(car$W,car$M)
> plot(1/car$W,car$M)
> pairs(car)
What was plotted in each case?
We will try to fit a linear model with mileage predicted by a linear function of weight. Here's how to do it:
> lm(car$Mil ~car$We)
An error message by any chance?
> lm(car$Mil ~car$We,na.action=na.omit)
Any better? Let's fit the model again but store the output in an Splus object.
> reg1 <- lm(car$Mil ~car$We,na.action=na.omit)
> reg1
Do you understand the output?
> names(reg1)
So those are the names of the components of reg1.
> plot(reg1$fit,reg1$res)
Yes, a plot of residuals versus fitted values.
Try modelling mileage as a linear function of weight and price:
> reg2 <- lm(car$Mil ~car$We+car$Pr,na.action=na.omit)
As you can see, there is a special shorthand for describing statistical models. The same shorthand is used for many statistical functions, but (unfortunately) not for all statistical functions. The bits of Splus that have survived from its original incarnation usually offer fewer bells and whistles than their fancier, more recent offspring. Splus is still growing.
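The shorthand can be seen in isolation on a tiny example. The sketch below assumes a made-up data frame called toy (its numbers are illustrative only) and uses the data argument of lm so that the formula can refer to the columns by name.

```r
# Small made-up data set; the numbers are illustrative only
toy <- data.frame(Weight  = c(2.0, 2.5, 3.0, 3.5, 4.0),
                  Mileage = c(33, 30, 26, 23, 20))

# Response on the left of ~, predictors on the right;
# data = says where to look up the variable names
fit <- lm(Mileage ~ Weight, data = toy)
coef(fit)

# Adding a term with + gives a model with two predictors
toy$Price <- c(9, 12, 11, 16, 20)
fit2 <- lm(Mileage ~ Weight + Price, data = toy)
coef(fit2)
```

Here + means "include this term in the model", not arithmetic addition of the two columns.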
> xvalues <- seq(-2,2,by=0.05)
> plot(xvalues,exp(-xvalues^2))
Use the help for "plot" to figure out how to get a smooth line for the plot (connect the dots). Improve the plot by figuring out how to put a title on the plot and change the axis labels.
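One possible way to do this, offered as a sketch rather than the intended answer. It relies on the type, main, xlab and ylab arguments of plot; the title and label strings, and the name yvalues, are made up.

```r
xvalues <- seq(-2, 2, by = 0.05)
yvalues <- exp(-xvalues^2)

# type = "l" joins successive points with line segments instead of
# drawing separate dots; main sets the title, xlab and ylab the axis labels
plot(xvalues, yvalues, type = "l",
     main = "A bell-shaped curve", xlab = "x", ylab = "exp(-x^2)")
```

The curve looks smooth only because the points are closely spaced; with a coarser grid the straight segments become visible.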
You have been working for a while, creating S objects. This is what you have so far:
> objects()
 [1] ".Last.fixed" ".Last.value" "last.dump"
 [4] "ll"          "M"           "test1"
 [7] "test2"       "test3"       "values"
[10] "x"           "xvalues"     "v"
[13] "z"
Now you need a break. You put your floppy disk into the slot (it becomes drive A:), then dump everything:
> dump(objects(),"A:\\jan16")
[1] "A:\\jan16"
For the purposes of a test, you kill everything, then try to restore it from your floppy disk.
> remove(objects())
> objects()
character(0)
Yes, they're gone but you can get them back!
> source("A:\\jan16")
> objects()
 [1] ".Last.fixed" ".Last.value" "last.dump"
 [4] "ll"          "M"           "test1"
 [7] "test2"       "test3"       "values"
[10] "x"           "xvalues"     "v"
[13] "z"
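The same round trip can be rehearsed on a small scale without touching your real objects. This sketch assumes two throwaway objects, aa and bb, and a temporary file from tempfile() standing in for the floppy-disk path; removing by name through the list argument is an assumption about how remove is called.

```r
# Two small objects to save; the names are illustrative only
aa <- c(2, 3, 4)
bb <- "Hello there"

# Write their definitions, as text, to a temporary file
# (a stand-in for the floppy-disk path used above)
f <- tempfile()
dump(c("aa", "bb"), f)

# Remove the objects, then recreate them by running the saved text
remove(list = c("aa", "bb"))
source(f)
aa
bb
```

Because dump writes ordinary Splus commands, the file can also be opened in a text editor to see exactly what will be restored.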