The following data are taken from Box and Cox (1964), J. Royal Statistical Society B. They give the survival times in units of ten hours for experimental animals given various combinations of poison and treatment. Each poison (levels I, II, III) was administered to 4 animals for each treatment (levels A, B, C, D). The four columns of the following array correspond to the four treatments.
                 A      B      C      D
Poison I        .31    .82    .43    .45
                .45   1.10    .45    .71
                .46    .88    .63    .66
                .43    .72    .76    .62
Poison II       .36    .92    .44    .56
                .29    .61    .35   1.02
                .40    .49    .31    .71
                .23   1.24    .40    .38
Poison III      .22    .30    .23    .30
                .21    .37    .25    .36
                .18    .38    .24    .31
                .23    .29    .22    .33

Remark: This is a famous example that is often used to demonstrate the effect of transformations.
Use something like

    factor(rep(LETTERS[1:4],12))

to get the factors. Important: Check that your factors are lined up correctly with the survival times.
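One possible way to set up the data and carry out the alignment check is sketched below. It is written in R syntax, which is close to Splus but may need small changes there. The sketch assumes the survival times are typed in row by row from the array, so that treatment changes fastest and poison changes every 16 values. The helper names cell.n and cell.mean are introduced only for this illustration.

```r
# Survival times (units of ten hours), entered row by row from the array,
# so the treatment (column) changes fastest.
survival <- c(
  .31, .82, .43, .45,  .45, 1.10, .45, .71,  .46, .88, .63, .66,  .43, .72, .76, .62,  # poison I
  .36, .92, .44, .56,  .29, .61, .35, 1.02,  .40, .49, .31, .71,  .23, 1.24, .40, .38, # poison II
  .22, .30, .23, .30,  .21, .37, .25, .36,  .18, .38, .24, .31,  .23, .29, .22, .33)   # poison III

# Treatment cycles A, B, C, D along each row; poison is constant over 16 values.
treatment <- factor(rep(LETTERS[1:4], 12))
poison    <- factor(rep(c("I", "II", "III"), rep(16, 3)))
boxcox    <- data.frame(survival, poison, treatment)

# Alignment check: every poison-by-treatment cell should hold exactly 4 animals,
# and the cell means should agree with what you get by hand from the array.
# For example, poison I / treatment A is (.31 + .45 + .46 + .43)/4 = 0.4125.
cell.n    <- table(boxcox$poison, boxcox$treatment)
cell.mean <- tapply(boxcox$survival, list(boxcox$poison, boxcox$treatment), mean)
print(cell.n)
print(round(cell.mean, 4))
```

If a cell count differs from 4, or a cell mean disagrees with a hand calculation from one column of the array, the factors are not lined up with the survival times.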
    bc1<-aov(survival~poison+treatment,boxcox)

The aov() function fits an additive model by least squares. If you (conceptually) rearrange survival into a three-way array SURVIVAL, indexed by treatments, poisons, and animals, then aov() is effectively finding vectors T.effect and P.effect to solve a least-squares problem:
    Decompose
        SURVIVAL[treat,poison,k] = constant + T.effect[treat] + P.effect[poison]
                                            + residual[treat,poison,k]
        for treat  = A, B, C, D
            poison = I, II, III
            k      = animals 1,2,3,4
    so as to minimize sum(residual^2).

Of course you don't actually have to construct SURVIVAL explicitly; Splus does it all for you.
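To see what that least-squares problem amounts to, here is a sketch (R syntax, assumed close enough to Splus) that builds the decomposition by hand and compares it with aov(). It relies on the design being balanced: with the effects constrained to sum to zero, the least-squares constant is the grand mean and each effect is a marginal mean minus the grand mean. The data setup repeats the assumed row-by-row entry order, and hand.fit and hand.resid are names made up for this illustration. Note that aov() may store its coefficients under a different parametrization, so the comparison is made on fitted values rather than on coefficients.

```r
# Data entered row by row from the array (treatment changes fastest).
survival <- c(
  .31, .82, .43, .45,  .45, 1.10, .45, .71,  .46, .88, .63, .66,  .43, .72, .76, .62,
  .36, .92, .44, .56,  .29, .61, .35, 1.02,  .40, .49, .31, .71,  .23, 1.24, .40, .38,
  .22, .30, .23, .30,  .21, .37, .25, .36,  .18, .38, .24, .31,  .23, .29, .22, .33)
treatment <- factor(rep(LETTERS[1:4], 12))
poison    <- factor(rep(c("I", "II", "III"), rep(16, 3)))
boxcox    <- data.frame(survival, poison, treatment)

# The additive fit from the handout.
bc1 <- aov(survival ~ poison + treatment, boxcox)

# Hand decomposition for a balanced layout (effects sum to zero).
constant <- mean(boxcox$survival)
P.effect <- tapply(boxcox$survival, boxcox$poison, mean) - constant
T.effect <- tapply(boxcox$survival, boxcox$treatment, mean) - constant

# Fitted value for each animal: constant + its treatment effect + its poison effect.
hand.fit <- as.vector(constant +
                      T.effect[as.character(boxcox$treatment)] +
                      P.effect[as.character(boxcox$poison)])
hand.resid <- boxcox$survival - hand.fit

# Both differences should be zero up to rounding error.
print(max(abs(hand.fit - as.vector(fitted(bc1)))))
print(sum(hand.resid^2) - sum(resid(bc1)^2))
```

The hand calculation works only because every cell has the same number of animals; for unbalanced data the marginal means no longer solve the least-squares problem and you would need aov() or a general regression routine.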
The additive fit would be perfect if all residuals were zero. The fit is considered reasonable if there are no "obvious patterns" in the residuals.
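One common way to look for patterns is to plot the residuals against the fitted values and to compare the spread of the residuals across groups. The sketch below (R syntax, assumed close to Splus) does both for the additive fit. The data setup again assumes row-by-row entry, and fit, res and res.sd are illustrative names.

```r
# Data entered row by row from the array (treatment changes fastest).
survival <- c(
  .31, .82, .43, .45,  .45, 1.10, .45, .71,  .46, .88, .63, .66,  .43, .72, .76, .62,
  .36, .92, .44, .56,  .29, .61, .35, 1.02,  .40, .49, .31, .71,  .23, 1.24, .40, .38,
  .22, .30, .23, .30,  .21, .37, .25, .36,  .18, .38, .24, .31,  .23, .29, .22, .33)
treatment <- factor(rep(LETTERS[1:4], 12))
poison    <- factor(rep(c("I", "II", "III"), rep(16, 3)))
boxcox    <- data.frame(survival, poison, treatment)

bc1 <- aov(survival ~ poison + treatment, boxcox)
fit <- as.vector(fitted(bc1))
res <- as.vector(resid(bc1))

# Residuals against fitted values: look for curvature, or for a spread
# that changes systematically with the size of the fitted value.
plot(fit, res, xlab = "Fitted value", ylab = "Residual")
abline(h = 0, lty = 2)

# Numeric companion to the plot: residual standard deviation within each poison.
res.sd <- tapply(res, boxcox$poison, function(r) sqrt(var(r)))
print(round(res.sd, 3))
```

A roughly even, structureless band around zero supports the additive fit; a funnel shape or very unequal group spreads is the kind of "obvious pattern" to watch for.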
    survival~poison*treatment

The new formula has the effect of replacing T.effect[treat] + P.effect[poison] by a matrix TandP.effect[treat,poison]. That is, it fits the model with interactions.
How good is the fit this time?
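One way to start on this question is sketched below (R syntax, assumed close to Splus). Because the interaction model allows a separate value for every treatment-poison combination, its fitted values are the 12 cell means, and the sketch checks this against aov(). It then compares the residual sums of squares of the two fits. The name bc2 and the other helper names are made up for this illustration, and the data setup assumes row-by-row entry from the array.

```r
# Data entered row by row from the array (treatment changes fastest).
survival <- c(
  .31, .82, .43, .45,  .45, 1.10, .45, .71,  .46, .88, .63, .66,  .43, .72, .76, .62,
  .36, .92, .44, .56,  .29, .61, .35, 1.02,  .40, .49, .31, .71,  .23, 1.24, .40, .38,
  .22, .30, .23, .30,  .21, .37, .25, .36,  .18, .38, .24, .31,  .23, .29, .22, .33)
treatment <- factor(rep(LETTERS[1:4], 12))
poison    <- factor(rep(c("I", "II", "III"), rep(16, 3)))
boxcox    <- data.frame(survival, poison, treatment)

bc1 <- aov(survival ~ poison + treatment, boxcox)   # additive model
bc2 <- aov(survival ~ poison * treatment, boxcox)   # model with interactions

# Fitted values of the interaction model should equal the cell means.
cell.mean <- tapply(boxcox$survival, list(boxcox$poison, boxcox$treatment), mean)
cell.fit  <- cell.mean[cbind(as.character(boxcox$poison),
                             as.character(boxcox$treatment))]
print(max(abs(cell.fit - as.vector(fitted(bc2)))))   # zero up to rounding

# Residual sums of squares: the interaction fit can never be worse.
rss1 <- sum(resid(bc1)^2)
rss2 <- sum(resid(bc2)^2)
print(c(additive = rss1, interaction = rss2))

# ANOVA table, including the line for the poison:treatment interaction.
print(summary(bc2))
```

A smaller residual sum of squares is guaranteed for the larger model, so it does not by itself show a better fit; the residuals of bc2 still need to be examined for patterns in the same way as those of bc1.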