Create a data vector with a missing value:

    junk <- c(1:10, NA)

Explain the output from each of the following:

    mean(junk)
    mean(junk, 0, T)
    mean(junk, T)
    mean(junk, na.rm=T)

Try to find some form of input data that mean() cannot handle.
Homework (Protecting your function from wrong input): (to be demonstrated to JAH or BM) Create a function mymean() that accepts the same arguments and has the same defaults as mean(), but which gives appropriate warnings if the user does something silly. Then add another optional argument, explain (false by default), so that your warnings appear only if the user calls mymean() with explain=T. Hint: Look at the help for the functions stop(), warning() and missing().
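As a starting point for reading those help pages, the three functions in the hint can be seen together in a minimal sketch. The function safe.recip() and its verbose argument are made up for illustration and are not part of the homework; the code is written for R and may need adjusting in Splus.

```r
# Hypothetical helper, not part of the exercise: returns the reciprocal of x.
safe.recip <- function(x, verbose = FALSE) {
  # missing() is TRUE when the caller did not supply the argument at all
  if (missing(x)) stop("argument 'x' was not supplied")
  # stop() abandons the call with an error message
  if (!is.numeric(x)) stop("'x' must be numeric")
  # warning() reports a problem but lets the function carry on;
  # here it is only issued when the caller asks for it
  if (verbose && any(x == 0, na.rm = TRUE))
    warning("'x' contains zeros, so the result contains Inf")
  1 / x
}

safe.recip(c(1, 2, 4))
```

Calling safe.recip("a") or safe.recip() stops with an error, while safe.recip(0, verbose = TRUE) still returns a value but also issues a warning.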
    resample <- function(sample.size, replicates) {
      out <- vector()
      for(i in seq(1, replicates)) {
        samp <- rnorm(sample.size)
        out[i] <- mean(samp)
      }
      hist(out)
    }

Check that you understand how the function works. What does each of the commands do? You will notice that the function uses a for() loop. Such loops are not very efficient in Splus, but there is usually a way of avoiding them.
Rewrite the function to remove the for() loop: generate a matrix of random normals (using rnorm()) with dim equal to c(replicates, sample.size), then use apply() to take the mean of each row. What advantages or disadvantages do you see with each form of the function? Hint: Try some big numbers.
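One possible shape for the loop-free version is sketched below. The name replicate.means() is hypothetical, and the sketch returns the means rather than plotting them so that the result can be inspected. It illustrates the matrix-plus-apply() idea described above and is not the only way to do it.

```r
# Hypothetical name, chosen so that the original resample() stays untouched.
replicate.means <- function(sample.size, replicates) {
  # One row per replicate, one column per observation
  samp <- matrix(rnorm(replicates * sample.size),
                 nrow = replicates, ncol = sample.size)
  # MARGIN = 1 applies mean() to each row, giving one mean per replicate
  apply(samp, 1, mean)
}

out <- replicate.means(50, 1000)
length(out)
```

Calling hist(replicate.means(50, 1000)) then draws the same kind of histogram as resample(50, 1000).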
The call

    resample(50, trim = 0.25, xlab = "samples of size 50")

should use sample.size = 50 and the default value of replicates, and it should calculate a 25% trimmed mean and write the xlab under the histogram. Make your function return something useful, such as a set of summary statistics for the generated (trimmed) means.
Hint: You need to find help on using variable numbers of arguments:
see "..." as described on page 95 of Venables & Ripley.
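The "..." mechanism can be seen in a small sketch, assuming a made-up wrapper centre() that simply hands any extra arguments on to mean(). It shows only the forwarding; deciding which extra arguments go to mean() and which go to hist() is left to your own function.

```r
# Hypothetical wrapper: forwards any extra arguments to mean().
centre <- function(x, ...) {
  # Whatever the caller supplies beyond x is collected in "..."
  # and passed on unchanged
  mean(x, ...)
}

centre(c(1, 2, 3, 100))               # no extra arguments: plain mean
centre(c(1, 2, 3, 100), trim = 0.25)  # trim reaches mean() through "..."
centre(c(1, NA, 3), na.rm = TRUE)     # so does na.rm
```

Note that centre() itself never names trim or na.rm; it accepts them only because mean() does.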
    resample(-3, 10)
    resample(month.name)

What other sorts of input should you protect against? Hint: ?stop
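A check of this kind might look like the following sketch. The helper check.count() and its message are hypothetical, chosen only to show stop() rejecting the two calls above; your own function may need different or additional checks.

```r
# Hypothetical checker for an argument that must be one positive whole number.
check.count <- function(n, name = "n") {
  # The tests run left to right and stop at the first failure, so the
  # numeric comparisons are only reached for a single numeric value
  if (!is.numeric(n) || length(n) != 1 || !is.finite(n) ||
      n < 1 || n != round(n))
    stop(paste(name, "must be a single positive whole number"))
  invisible(TRUE)
}

check.count(10, "sample.size")
```

With this helper, check.count(-3, "sample.size") and check.count(month.name, "sample.size") both stop with an error before any random numbers are generated.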