STAT 2, SPRING 2006
A. Adhikari
ANSWERS TO REGRESSION REVIEW FOR QUIZ 2
Your answers may be a bit different due to rounding. That's OK.
1a) 81.6th (or 81st, or 82nd). The z for the VSAT is 1.3 so the z
for the estimated MSAT is 0.7x1.3 = 0.91.
1b) 27.42th (or 27th, or 28th). The z for the VSAT is -0.85 so
the z for the estimated MSAT is 0.7x(-0.85) = -0.595.
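If you want to check 1a and 1b without the normal table, here is a minimal sketch. It assumes the scores follow the normal curve and uses r = 0.7 and the VSAT z values quoted above. The function name estimated_percentile is just an illustrative label. The results will differ a little from the table answers because the table rounds z.

```python
from statistics import NormalDist

def estimated_percentile(z_x, r):
    """Regression estimate in standard units, and its percentile rank.

    Assumes the y values follow the normal curve.
    """
    z_y = r * z_x                      # regression method: z_y = r * z_x
    return z_y, 100 * NormalDist().cdf(z_y)

# z values for the VSAT and r = 0.7 are taken from the answers above.
z_1a, pct_1a = estimated_percentile(1.3, 0.7)
z_1b, pct_1b = estimated_percentile(-0.85, 0.7)

print(f"1a: z = {z_1a:.3f}, percentile rank about {pct_1a:.1f}")
print(f"1b: z = {z_1b:.3f}, percentile rank about {pct_1b:.1f}")
```

Note that the estimate in 1b comes out below the 50th percentile because its z is negative.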
2a) (i) by the regression effect. On average, the people on the
70th percentile of VSAT scores will be somewhere strictly between the
50th and 70th percentiles of the MSAT scores. The student in the
problem has an MSAT score on the 70th percentile, which is above that
range.
2b) (iv). This time, the regression effect is not enough - we
need more information. We know that on average the people on the
70th percentile of VSAT scores will be somewhere between the 50th and
70th percentiles of the MSAT scores. Exactly where in that range
depends on r, which we don't have. So we can't say how the
average MSAT score of that group compares to the 60th percentile.
2c) (i) by the regression effect. On average, people on the 25th
percentile of VSAT scores will be somewhere strictly between the 25th
and 50th percentiles of the MSAT scores. So on average on the
MSAT they will be above the 25th percentile. That will be our estimate
for the student.
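The reasoning in 2a to 2c can be tried out numerically. The sketch below assumes both sets of scores follow the normal curve and that r is strictly between 0 and 1. The three values of r are made up for illustration, since the problem does not give r.

```python
from statistics import NormalDist

std_normal = NormalDist()

def estimated_percentile(x_percentile, r):
    """Percentile rank of the regression estimate of y for someone
    at the given percentile of x (normal curve assumed)."""
    z_x = std_normal.inv_cdf(x_percentile / 100)
    return 100 * std_normal.cdf(r * z_x)

# Hypothetical correlations, chosen only to show the pattern.
for r in (0.2, 0.5, 0.8):
    from_70th = estimated_percentile(70, r)
    from_25th = estimated_percentile(25, r)
    print(f"r = {r}: 70th -> about {from_70th:.1f}, 25th -> about {from_25th:.1f}")
```

For every such r, the estimate for the 70th percentile lands between the 50th and 70th, and the estimate for the 25th lands between the 25th and 50th. Whether the first one is above or below the 60th percentile changes with r, which is the point of 2b.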
3a) (iii) because for "height=42 inches" the regression line is at
"weight = 38.92 pounds".
3b) 40, 1.52 (rms error; you're looking only at the central strip of
the scatter diagram).
3c) 38.38, 1.52.
3d) 14.69%. (Any answer between 14% and 15% is OK.) You found the
average and SD of these children's weights in (c). So 40
becomes a z of about 1.07. I used 1.05 on the table.
3e) 50%. Compare this to (d). The kids in (d) are below
average in height, so a smaller percent of them are above the overall
average weight.
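Here is a rough check of 3d and 3e, assuming the weights within a strip of the scatter diagram follow the normal curve with SD equal to the rms error. The numbers 38.38, 40 and 1.52 are from the answers above. For 3e the sketch assumes the group's weights are centered at the overall average of 40.

```python
from statistics import NormalDist

RMS_ERROR = 1.52          # rms error of the regression line, from 3b and 3c
OVERALL_AVG_WEIGHT = 40   # overall average weight in pounds

def percent_above(cutoff, center, spread):
    """Percent of a normal distribution lying above the cutoff."""
    return 100 * (1 - NormalDist(center, spread).cdf(cutoff))

# 3d: strip centered at the regression estimate 38.38
pct_3d = percent_above(OVERALL_AVG_WEIGHT, 38.38, RMS_ERROR)

# 3e: group centered at the overall average (assumption stated above)
pct_3e = percent_above(OVERALL_AVG_WEIGHT, OVERALL_AVG_WEIGHT, RMS_ERROR)

print(f"3d: about {pct_3d:.1f}%")
print(f"3e: about {pct_3e:.1f}%")
```

The 3d figure uses the unrounded z, so it is slightly smaller than the table answer but still between 14% and 15%.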
3f) 1.98. (The answer is "z rms errors", where z is the value
corresponding to a central area of 80%. Hence 1.3 x 1.52.)
3g) 1.98. The statement is equivalent to the one in (f).
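The calculation in 3f and 3g can be sketched as follows, assuming the errors follow the normal curve. The rms error 1.52 is from above. The exact z for a central area of 80% is a little under the table value of 1.3, so the result is slightly smaller than 1.98.

```python
from statistics import NormalDist

RMS_ERROR = 1.52  # rms error of the regression line, from above

def central_z(central_area):
    """z such that the area between -z and z equals central_area."""
    tail = (1 - central_area) / 2
    return NormalDist().inv_cdf(1 - tail)

z_80 = central_z(0.80)
half_width = z_80 * RMS_ERROR   # "z rms errors" on either side of the line

print(f"z for central 80%: about {z_80:.2f}")
print(f"half-width: about {half_width:.2f} pounds")
```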
3h) 78.87%. This does not involve regression, because there's only
one variable in the problem!!
The variable "height" never appears, so the two variables don't have to
be linked, so you don't have to use r in any way. This problem
could have been done in Chapter 5. Weights are normal with
average 40 and SD 2. You want the area in the range 40 +-
2.5. That's the area between -z = -1.25 and z = 1.25.
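A quick check of 3h, using only the average 40 and SD 2 of the weights and assuming the normal curve:

```python
from statistics import NormalDist

# Weights alone: average 40, SD 2 (no regression involved).
weights = NormalDist(mu=40, sigma=2)

# Area in the range 40 +- 2.5, i.e. between z = -1.25 and z = 1.25.
area_3h = 100 * (weights.cdf(42.5) - weights.cdf(37.5))

print(f"3h: about {area_3h:.2f}%")
```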
3i) 90%. This problem is about errors in the regression estimate.
These are measured by the rms error of regression. Now 2.5 is
equal to 1.645 rms errors (do 2.5/1.52), and 1.645 r.m.s. errors on
either side of the regression line picks up about 90% of the scatter.
3j) 10%. This is essentially the same problem as (i); you're just
looking at what's beyond 1.645 r.m.s. errors on either side of the
regression line.
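Here is a sketch of 3i and 3j together, assuming the errors about the regression line follow the normal curve with SD equal to the rms error of 1.52:

```python
from statistics import NormalDist

RMS_ERROR = 1.52   # rms error of the regression line
MARGIN = 2.5       # pounds on either side of the regression estimate

z = MARGIN / RMS_ERROR                       # how many rms errors is 2.5?
inside = 100 * (2 * NormalDist().cdf(z) - 1) # 3i: within the margin
outside = 100 - inside                       # 3j: beyond the margin

print(f"z = {z:.3f}")
print(f"3i: about {inside:.0f}%, 3j: about {outside:.0f}%")
```

The two answers have to add up to 100%, since every point is either within 2.5 pounds of the line or beyond it.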
Note: As you have probably noticed, 3f and 3i are really the same
problem. So are 3g and 3j. In all four, the crucial
question is "How many rms errors on either side of the regression
line?" The answer to that question is z!