Classical experimental design has its roots in agricultural applications. In this lab we will consider one such experiment, conducted in 1932 to determine the effect of various levels of nitrogen and phosphorus additives on the productivity of potato plants. In this experiment, 6 treatments consisting of different combinations of nitrogenous and phosphatic fertilizers were applied to 36 plots of land that formed a 6 by 6 square grid, on which potatoes were grown. The weight (in pounds) of potatoes harvested from each plot was recorded.
Let R be a factor with levels 1, 2, 3, 4, 5, and 6 indicating the row in which a given plot lies; let C be a factor, also with levels 1, 2, 3, 4, 5, and 6, indicating the column in which the plot lies; and let T be a factor with levels A, B, C, D, E, and F indicating which of the 6 fertilizer treatments was applied to the plot. With this notation, the results of our experiment can be presented as follows (the numerical entry in each cell represents the weight of the potatoes harvested from the given plot with the particular treatment):
This data set is taken from pages 90–91 and 199–207 of The Design of Experiments by Sir Ronald A. Fisher, 7th edition (New York: Hafner, 1960), and it is also discussed on pages 411–412 and 588–589 of A Course in Probability and Statistics by Charles J. Stone (Belmont, Calif.: Wadsworth, 1995).
Our experiment is in the form of a Latin square; that is, each treatment is applied exactly once in each row and once in each column.
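The defining property of a Latin square can be checked mechanically. The following is a minimal sketch in Python (not S), using a hypothetical cyclic 6 by 6 layout that is not the arrangement used in the 1932 experiment; the helper name `is_latin_square` is introduced here purely for illustration.

```python
# Hypothetical cyclic layout for illustration only; this is NOT the
# arrangement of treatments used in the actual 1932 experiment.
treatments = "ABCDEF"
layout = [[treatments[(r + c) % 6] for c in range(6)] for r in range(6)]


def is_latin_square(grid):
    """Return True if every symbol occurs exactly once in each row and column."""
    n = len(grid)
    symbols = set(grid[0])
    # A valid n by n square needs n distinct symbols and n entries per row.
    if len(symbols) != n or any(len(row) != n for row in grid):
        return False
    rows_ok = all(set(row) == symbols for row in grid)
    cols_ok = all({grid[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


print(is_latin_square(layout))
```

Applying the same check to the layout in the data table above is a quick way to confirm that each treatment appears once per row and once per column.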
Read sections 11.1 and 11.2 of your text before you continue with the lab.
Let ,
and
denote the artificial
random variables with the design distribution, corresponding to
R, C, and T, respectively. Let
,
, and
denote spaces of all functions on
,
, and
, respectively, where each
.
Using the S commands lm and anova (see appendix), complete the following table.
What do you observe? Interpret your results.
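To see what lies behind such a table, it may help to compute the sums of squares by hand. The sketch below is in Python rather than S and assumes the table in question is for the additive model in R, C, and T. The layout is a hypothetical cyclic square and the yields are synthetic random numbers, so none of the printed values relate to the 1932 data; the helper `factor_ss` is a name introduced here for illustration.

```python
import random

random.seed(0)
n = 6
treatments = "ABCDEF"

# Hypothetical layout and SYNTHETIC yields, placeholders for the real table.
layout = [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]
y = [[random.gauss(400, 50) for _ in range(n)] for _ in range(n)]

grand = sum(map(sum, y)) / n**2  # overall mean of the 36 plots


def factor_ss(label):
    """Sum of squares for a factor whose level on plot (r, c) is label(r, c).

    Each level occurs n times, so SS = n * sum((level mean - grand mean)^2).
    """
    totals = {}
    for r in range(n):
        for c in range(n):
            key = label(r, c)
            totals[key] = totals.get(key, 0.0) + y[r][c]
    return sum(n * (t / n - grand) ** 2 for t in totals.values())


ss_rows = factor_ss(lambda r, c: r)
ss_cols = factor_ss(lambda r, c: c)
ss_trt = factor_ss(lambda r, c: layout[r][c])
ss_total = sum((y[r][c] - grand) ** 2 for r in range(n) for c in range(n))

# In a Latin square the three factors are orthogonal, so the residual sum
# of squares is what remains after subtracting the three main effects.
ss_resid = ss_total - ss_rows - ss_cols - ss_trt

df = {"R": n - 1, "C": n - 1, "T": n - 1, "Residuals": (n - 1) * (n - 2)}
ss = {"R": ss_rows, "C": ss_cols, "T": ss_trt, "Residuals": ss_resid}

for source in df:
    print(f"{source:10s} df={df[source]:2d}  SS={ss[source]:12.2f}  "
          f"MS={ss[source] / df[source]:10.2f}")
```

Each mean square is the sum of squares divided by its degrees of freedom, and the F statistic for a factor is its mean square divided by the residual mean square.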
How do the entries in the "SS" column of this
table compare with the values of
,
, and
you obtained in part 2? How many parameters did you
estimate in the regression ?
So far, we have treated this experiment as though nothing were known about the quantities of nitrogenous and phosphatic fertilizers that were combined to form our six treatments. Let N denote a factor with two levels, 0 and 1, indicating the amount of nitrogenous fertilizer in a given treatment; and let P denote a factor with three levels, 0, 1, and 2, indicating the amount of phosphatic fertilizer in the treatment. We can now express the levels of T as a function of N and P as follows:
Let and
denote the artificial
random variables with the design distribution, corresponding to
N and P, respectively. Let
and
denote spaces of all functions on
,
and
, respectively, and set
.
In the plot below the three lines represent the change in mean weight of potatoes as you switch from N=0 to N=1 holding P fixed (observations are pooled across R and C). Is this picture what you would have expected based on your ANOVA results above? Explain.
Parts 7 and 8 below are bonus questions. I suggest that you do part 8.
In choosing a Latin square design, the experimenters implicitly made a decision to ignore interactions between R, C and T in order to reduce the number of runs required to estimate the various main effects. Classically, interactions between treatments (N and P) and blocking variables (R and C) are completely ignored independently of the choice of design (see the discussion on blocking and randomization in Section 11.6 of the text). In our experiment for example, the interaction between any two of T, R, or C has 25 degrees of freedom. It is clearly not possible to estimate the main effects of these three factors as well as even a single interaction.
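The degrees-of-freedom argument can be spelled out as simple bookkeeping. The short Python sketch below (not S; all variable names are illustrative) counts what is available from 36 plots and what the three main effects already use.

```python
n_plots = 6 * 6
levels = {"R": 6, "C": 6, "T": 6}

# One degree of freedom goes to the overall mean.
total_df = n_plots - 1

# Each 6-level factor uses 5 degrees of freedom for its main effect.
main_effect_df = {name: k - 1 for name, k in levels.items()}

# What is left after fitting the three main effects.
residual_df = total_df - sum(main_effect_df.values())

# A full two-factor interaction between 6-level factors needs 5 * 5.
interaction_df = (levels["R"] - 1) * (levels["C"] - 1)

print("total:", total_df)
print("main effects:", sum(main_effect_df.values()))
print("left over:", residual_df)
print("needed for one interaction:", interaction_df)
```

Since a single two-factor interaction needs more degrees of freedom than remain after the main effects are fitted, it cannot be estimated alongside them.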
In this case, however, we are able to entertain an alternative analysis of our data in which R, C, N, and P are viewed as quantitative, and we restrict our attention to quadratic polynomials in these four variables. If we treat our variables in this way, the interaction space between R and C, say, has only 1 degree of freedom.
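One way to see the single degree of freedom is to note that, among quadratic polynomials in R and C, the only term involving both variables is the product RC. The Python sketch below (not S) checks this numerically by comparing the rank of the design matrix with and without the product column; the helper `rank` is a small exact Gaussian elimination written for this illustration, not a library routine.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        # Eliminate this column from all rows below the pivot row.
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


# (row, column) coordinates of all 36 plots, treated as quantitative.
plots = list(product(range(1, 7), repeat=2))

# Columns: 1, R, R^2, C, C^2 (quadratic in each variable, no interaction).
additive = [[1, r, r * r, c, c * c] for r, c in plots]

# The same columns plus the product term R*C.
with_interaction = [row + [r * c] for row, (r, c) in zip(additive, plots)]

interaction_df = rank(with_interaction) - rank(additive)
print(rank(additive), rank(with_interaction), interaction_df)
```

By contrast, when R and C are treated as 6-level factors their interaction takes 25 degrees of freedom, as noted above.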
What do you observe? Comment on your results, explaining exactly what this ANOVA table is telling you. What is the null hypothesis here?