STAT 2, SPRING 2006
A. Adhikari
ANSWERS TO REGRESSION REVIEW FOR QUIZ 2


Your answers may be a bit different due to rounding. That's OK.

1a) 81.6th (or 81st, or 82nd).  The z for the VSAT is 1.3 so the z for the estimated MSAT is 0.7x1.3 = 0.91.
1b) 27.42th (or 27th, or 28th).  The z for the VSAT is -0.85 so the z for the estimated MSAT is 0.7x(-0.85) = -0.595.
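
As a rough check on both parts of Problem 1, the calculation can be sketched in Python. The function name estimated_percentile is just an illustrative choice; r = 0.7 and the two VSAT z values are the ones used in the answers above. The computer uses the exact normal curve rather than the printed table, so its percentiles differ slightly from the table answers.

```python
from statistics import NormalDist

def estimated_percentile(z_x, r):
    """Percentile rank of the regression estimate of y, given x in standard units."""
    z_y = r * z_x                      # regression estimate in standard units
    return 100 * NormalDist().cdf(z_y) # area to the left of z_y, as a percent

p_1a = estimated_percentile(1.3, 0.7)    # VSAT z = 1.3
p_1b = estimated_percentile(-0.85, 0.7)  # VSAT z = -0.85
print(round(p_1a, 1), round(p_1b, 1))
```

Both results should land within about half a percentile of the table answers.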
 
2a) (i) by the regression effect.  On average, the people on the 70th percentile of VSAT scores will be somewhere strictly between the 50th and 70th percentiles of the MSAT scores.  The student in the problem has an MSAT score on the 70th percentile, which is above that range.
2b) (iv).  This time, the regression effect is not enough - we need more information.  We know that on average the people on the 70th percentile of VSAT scores will be somewhere between the 50th and 70th percentiles of the MSAT scores.  Exactly where in that range depends on r, which we don't have.  So we can't say how the average MSAT score of that group compares to the 60th percentile.
2c) (i) by the regression effect. On average, people on the 25th percentile of VSAT scores will be somewhere strictly between the 25th and 50th percentiles of the MSAT scores.  So on average on the MSAT they will be above the 25th percentile. That will be our estimate for the student.
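
To see why 2a and 2c can be settled by the regression effect while 2b cannot, here is a small Python sketch. It assumes a football-shaped scatter with both scores following the normal curve, and the values of r (0.2, 0.5, 0.8) are made up for illustration; the problem does not give r.

```python
from statistics import NormalDist

std_normal = NormalDist()

def avg_percentile_of_group(x_percentile, r):
    """Percentile rank (on y) of the regression estimate for people
    at the given percentile of x."""
    z_x = std_normal.inv_cdf(x_percentile / 100)  # x in standard units
    return 100 * std_normal.cdf(r * z_x)          # estimate for y, as a percentile

for r in (0.2, 0.5, 0.8):  # hypothetical correlations
    print(r,
          round(avg_percentile_of_group(70, r), 1),
          round(avg_percentile_of_group(25, r), 1))
```

For every r strictly between 0 and 1, the first column of results stays between 50 and 70 and the second between 25 and 50. Whether the first is above or below 60 depends on r, which is the point of 2b.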

3a) (iii) because for "height=42 inches" the regression line is at "weight = 38.92 pounds".
3b) 40, 1.52 (rms error; you're looking only at the central strip of the scatter diagram).
3c) 38.38, 1.52.
3d) 14.69%. (Any answer between 14% and 15% is OK.)  You found the average and SD of these children's weights in (c).  So 40 becomes a z of about 1.07.  I used z = 1.05 on the table.
3e) 50%.  Compare this to (d).  The kids in (d) are below average in height, so a smaller percent of them are above the overall average weight.
3f) 1.98 (The answer is "z x r.m.s. errors", where you find the z corresponding to a central area of 80%.  Hence 1.3 x 1.52.)
3g) 1.98.  The statement is equivalent to the one in (f).
3h) 78.87%.  This does not involve regression, because there's only one variable in the problem!!  The variable "height" never appears, so the two variables don't have to be linked, so you don't have to use r in any way.  This problem could have been done in Chapter 5.  Weights are normal with average 40 and SD 2.  You want the area in the range 40 +- 2.5.  That's the area between -z = -1.25 and z = 1.25.
3i) 90%. This problem is about errors in the regression estimate.  These are measured by the rms error of regression.  Now 2.5 is equal to 1.645 rms errors (do 2.5/1.52), and 1.645 r.m.s. errors on either side of the regression line picks up about 90% of the scatter.
3j) 10%. This is essentially the same problem as (i); you're just looking at what's beyond 1.645 r.m.s. errors on either side of the regression line.
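
The normal-curve areas in 3d and 3h can be checked with a short Python sketch. The numbers are the ones quoted in the answers (average 38.38 and r.m.s. error 1.52 for the strip in 3d; average 40 and SD 2 for all the children in 3h); the variable names are illustrative only.

```python
from statistics import NormalDist

# 3d: children in the height strip from (c).
# New average 38.38 pounds, new SD = r.m.s. error = 1.52 pounds.
strip = NormalDist(mu=38.38, sigma=1.52)
pct_3d = 100 * (1 - strip.cdf(40))  # percent weighing more than 40 pounds

# 3h: all the children, ignoring height. Average 40 pounds, SD 2 pounds.
everyone = NormalDist(mu=40, sigma=2)
pct_3h = 100 * (everyone.cdf(42.5) - everyone.cdf(37.5))  # within 40 +- 2.5

print(round(pct_3d, 2), round(pct_3h, 2))
```

The 3d result falls in the accepted range of 14% to 15%; it is a little below 14.69% because the computer uses z of about 1.07 instead of rounding to 1.05.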

Note: As you have probably noticed, 3f and 3i are really the same problem.  So are 3g and 3j.  In all four, the crucial question is "How many rms errors on either side of the regression line?"  The answer to that question is z.
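
The idea in the note can be sketched in Python as two functions that undo each other: one goes from a central percent to "z x r.m.s. error", the other from a distance in pounds back to a percent. The function names are hypothetical, the r.m.s. error of 1.52 pounds is taken from 3b, and the scatter around the regression line is assumed to follow the normal curve.

```python
from statistics import NormalDist

RMS_ERROR = 1.52          # r.m.s. error of regression, in pounds (from 3b)
std_normal = NormalDist()

def half_width_for(central_percent):
    """Distance (pounds) on either side of the line that captures
    the given central percent of the scatter (3f, 3g)."""
    z = std_normal.inv_cdf(0.5 + central_percent / 200)
    return z * RMS_ERROR

def percent_within(half_width):
    """Percent of the scatter within half_width pounds of the line (3i)."""
    z = half_width / RMS_ERROR  # distance measured in r.m.s. errors
    return 100 * (2 * std_normal.cdf(z) - 1)

print(round(half_width_for(80), 2))        # 3f and 3g
print(round(percent_within(2.5), 1))       # 3i
print(round(100 - percent_within(2.5), 1)) # 3j: what is left outside
```

The 3f result comes out slightly under 1.98 because the exact z for a central area of 80% is a bit less than the table value of 1.3.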