Statistics 200: Lab 7 (Friday 27 February 1998)

Today's tasks:
Low-level graphics. Construction of customized plots.

Much of what you have to do this week depends on the par() function. This function is used for manipulating the graphical parameters in Splus, and it has so many options that it can get very confusing. The Splus help page for par() is rather long, and it is hard to find information about a specific argument, so we have created an abbreviated help page for par(). The following page includes a link to an html version of the Splus help page, which you can search using Netscape; you might find this a bit easier than using the Splus help page.

Warning: This lab can take a surprising amount of time. Try not to spend too long on any one section; a few questions to BM or JH might speed you along.

Note: You can now easily manipulate the graphical parameters in Splus 4.0. We think that if you want to be able to reproduce your results easily, using the par() command is a better idea.

Problem 1 (Learning a bit about `par()`)

Try the following commands and see what they tell you (some of these take a bit of thought!):

par("din"); par("fin"); par("pin"); par("usr")
plot(runif(100),runif(100))
par("din"); par("fin"); par("pin"); par("usr")
lines(c(0,1),c(0,1))
lines(c(0.5,1),c(0,2))
par(xpd=T)
lines(c(0.5,1),c(0,2))

Explain to BM or JH what is going on. What do the various par(...) settings control? Try resizing the graphics window (using the mouse), then repeat the experiment. What does an "inch" mean in the par settings?

Often you will find yourself wanting to fiddle with some par setting, to get some fine graphics effect. A good habit to get into, especially inside functions, is

oldpar <- par() # save all the old settings
# now make all your changes to par
# ...
# finished with the graphics
par(oldpar)  # restore the old settings as you leave

You could also save just those par values that will be affected, and then restore those values as you leave; see the examples at the end of the help page for par.
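A minimal sketch of that second habit follows. It relies on the fact that par() returns the previous values of whatever parameters it sets. The function name draw.with.xpd is made up for illustration, and the code was written so that it also runs in R; we assume, but have not checked, that it behaves the same way in your version of Splus.

```r
# Sketch: save and restore only the parameters that are changed.
# draw.with.xpd is a hypothetical name used for illustration.
draw.with.xpd <- function() {
    oldpar <- par(xpd = TRUE, pty = "s")  # old values of xpd and pty only
    on.exit(par(oldpar))                  # restore them however the function exits
    plot(runif(100), runif(100))
    lines(c(0.5, 1), c(0, 2))             # not clipped to the plot region while xpd is TRUE
    invisible(oldpar)
}
```

Calling draw.with.xpd() and then typing par("xpd") should show that the setting in force before the call is back in place.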

Problem 2 (A bit of fun, that might actually take a bit of time)

Draw the biggest circle on the graphics device that you can. Draw the largest rectangle that you can. Draw the two diagonals (a `big X') of the largest rectangle. (The big X should have endpoints at the corners of the graphics device surface.) No cheating: don't use locator() to find the corner points. A more difficult problem, which might involve a lot of effort, is to draw the `big X' without changing the margin sizes. Hint: You can change the usr coordinates if you want to. (We found usr, plt, and xpd to be useful parameters.) For big circles, you could try symbols(), but the various arguments are tricky. We will settle for a square if you have too much trouble with circles. To orient yourself, you might take a peek at this rather garish picture showing the meanings of some important graphics parameters.

Problem 3 (Come back to this if you have time; your time would be better spent on the later problems)

(Optional. Skip if you are short of time. Actually, you should skip this problem if you are averse to pain and suffering. The split.screen() function gets hard to control after a while. Maybe you could use mfrow instead.) Write a function to draw the big X, as in the previous problem. Split the device into a 2 by 3 array of figures, using split.screen(). Draw a big X in the screens in the top left corner and the bottom right corner, with the screen number (1 or 6) written in the middle in a large letter.

The next three problems refer to an old maps data set (taken from a book by Andrews and Herzberg), which we will also be using next week. You can find the data in the maps section of the class library. You could also get it from the WWW if you feel the need for more read.table() practice: from the StatLib---Andrews & Herzberg Archive, get the old maps data set (Table 10.1) and read it into a data frame. Warning: How can there be 78 rows if there are only 39 points? You will have a little work to do to get the data into the format you need.
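If the raw table turns out to use two consecutive rows per point (this is only a guess at the layout, so look at the file before relying on it), the reshaping might be sketched as follows. The toy data frame, its column names, and its values are all made up for illustration and are not taken from Table 10.1.

```r
# Toy stand-in for the raw table: 3 points, 2 rows per point.
# ASSUMED layout: latitude row first, then longitude row.
# Column names and values are hypothetical.
raw <- data.frame(point  = c(1, 1, 2, 2, 3, 3),
                  map1   = c(43.1, 79.2, 42.3, 83.0, 46.5, 84.4),
                  actual = c(43.3, 79.1, 42.3, 83.1, 46.5, 84.3))

lat.rows  <- seq(1, nrow(raw), by = 2)   # rows 1, 3, 5, ... hold latitudes
long.rows <- lat.rows + 1                # rows 2, 4, 6, ... hold longitudes

# One row per point, with latitude and longitude side by side
tidy <- data.frame(point       = raw$point[lat.rows],
                   map1.lat    = raw$map1[lat.rows],
                   map1.long   = raw$map1[long.rows],
                   actual.lat  = raw$actual[lat.rows],
                   actual.long = raw$actual[long.rows])
```

With the real data the same indexing would turn 78 rows into 39, provided the assumed layout is right.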

Description of the data (taken from the Andrews and Herzberg book, page 63):

Great Lakes area. The data are taken from the eleven maps listed in Table 10.2. These maps are believed to be representative of the period of time commencing with the widespread knowledge that five major lakes existed in the interior of North America, and ending when relatively large scale hydrographic surveys of the lakes' shorelines were being done.

The data shown in Table 10.1 consist of the latitude, [phi], and longitude, [lambda], co-ordinates, as determined for each map, for each of 39 points easily identifiable on the eleven maps. These data were obtained by placing a grid over the old maps and doing a linear interpolation. Interpolation accuracy is felt to be good except for the indicated numbers. Also included are the current co-ordinates for the 39 points.

It is conjectured that there are five key ways a map might be systematically in error. These are: a constant error in latitude, a constant error in longitude, a proportional error in latitude, a proportional error in longitude, and error resulting in a non-zero angle between true North and the map's North. In addition, groups of locations, for example, one whole lake, may be off.

The primary task is to develop a methodology for parameterizing each map with respect to these characteristics and with respect to any other characteristics that seem to be important.
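One possible way to write the five systematic errors as a single model is sketched below. This is an illustration only and is not taken from the book. It treats latitude and longitude as plane coordinates, and the symbols a, b, and theta are our own notation: a for the constant errors, b for the proportional errors, and theta for the angle between true North and the map's North.

```latex
\[
\begin{pmatrix} \phi_{\mathrm{map}} \\ \lambda_{\mathrm{map}} \end{pmatrix}
=
\begin{pmatrix} a_{\phi} \\ a_{\lambda} \end{pmatrix}
+
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} b_{\phi} & 0 \\ 0 & b_{\lambda} \end{pmatrix}
\begin{pmatrix} \phi \\ \lambda \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{\phi} \\ \varepsilon_{\lambda} \end{pmatrix}
\]
```

A map with no systematic error would have a = 0, b = 1, and theta = 0. Each map coordinate is then a linear function of both actual coordinates.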

Note: A minus sign indicates that the interpolation accuracy is not good.

Problem 4 (The usa() command and manipulating data to work with it)

Pick any of the old maps (for example, Coronelli 1688). Draw a plot of the actual points, as a map. Be sure that you have 'west' pointing to the left, and that the horizontal axis is labeled with the degrees west. If you are ambitious, try to superimpose a map of the US (using the usa() function), so that you can see where the landmarks are.
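One way to get west on the left is to plot against the negative of the longitude and then label the axis with the positive values. The sketch below uses made-up coordinates, not the old maps data, and was written so that it also runs in R. Check which sign convention usa() uses for longitude before superimposing it.

```r
# Hypothetical coordinates (degrees north, degrees west), made up for illustration.
lat  <- c(41.9, 42.3, 43.3, 46.5)
long <- c(87.6, 83.1, 79.1, 84.3)

# Plot against -long so that larger western longitudes fall further left.
plot(-long, lat, axes = FALSE,
     xlab = "Longitude (degrees west)", ylab = "Latitude (degrees north)")
ticks <- pretty(long)                              # tick values in degrees west
axis(1, at = -ticks, labels = as.character(ticks)) # negated positions, positive labels
axis(2)
box()
```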

Note: The usa() command is okay for drawing maps of large areas of the USA, but when you zoom into smaller areas it doesn't do as well. Some versions of Splus have a maps library that contains more accurate maps of the USA. (I know that the full version of Splus 4.0 is supposed to have the maps library, that the student version doesn't have it, and that the UNIX versions of Splus since Splus 3.3 do have it.)

Problem 5 (More plotting of maps)

Draw a plot showing both actual locations and the locations for one of the old maps, with arrows joining each actual point to its location on the old map. Warning: look at the data for funny values.
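The arrows() function takes the start and end coordinates of each arrow. A minimal sketch with made-up coordinates (not the old maps data) follows; it was written so that it also runs in R. The axis limits are computed from both sets of points so that no arrow runs off the plot.

```r
# Hypothetical coordinates, made up for illustration (degrees north, degrees west).
actual.lat  <- c(41.9, 42.3, 43.3)
actual.long <- c(87.6, 83.1, 79.1)
old.lat     <- c(42.5, 42.0, 44.0)
old.long    <- c(88.4, 82.0, 79.9)

# Limits must cover both sets of points, or some arrows will be clipped.
xlim <- range(-actual.long, -old.long)
ylim <- range(actual.lat, old.lat)

plot(-actual.long, actual.lat, xlim = xlim, ylim = ylim,
     xlab = "-longitude", ylab = "latitude")
points(-old.long, old.lat, pch = 3)                   # old-map locations
arrows(-actual.long, actual.lat, -old.long, old.lat)  # from actual to old-map location
```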

Problem 6 (Again, if you've got time)

Pick an old map. Fit a separate linear model to each of oldmap$lat and oldmap$long, using actual latitude and longitude as predictors. Draw a picture showing the actual locations, with arrows attached indicating the residuals from the linear fits.
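The two fits and the residual arrows might be sketched as follows. The data frame d and its values are made up for illustration, so only the shape of the computation carries over to the real data. The code was written so that it also runs in R.

```r
# Hypothetical data: actual coordinates (lat, long) and old-map coordinates.
d <- data.frame(lat      = c(41.9, 42.3, 43.3, 46.5, 44.8),
                long     = c(87.6, 83.1, 79.1, 84.3, 81.0),
                old.lat  = c(42.5, 42.0, 44.0, 47.1, 45.0),
                old.long = c(88.4, 82.0, 79.9, 85.6, 80.2))

# One linear model per old-map coordinate, both actual coordinates as predictors.
fit.lat  <- lm(old.lat  ~ lat + long, data = d)
fit.long <- lm(old.long ~ lat + long, data = d)
r.lat  <- resid(fit.lat)
r.long <- resid(fit.long)

# Each arrow starts at an actual location and is displaced by the residuals.
x0 <- -d$long
y0 <- d$lat
x1 <- -(d$long + r.long)
y1 <- d$lat + r.lat
plot(x0, y0, xlim = range(x0, x1), ylim = range(y0, y1),
     xlab = "-longitude", ylab = "latitude")
arrows(x0, y0, x1, y1)
```

Residuals are often small compared with the spread of the points, so you may want to multiply them by a constant before drawing, and say so in the plot title.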