Scatter Plots

This button takes you to a screen that allows you to create scatter plots with a number of extras:

Basic Scatter Plot

To create a scatter plot, choose the X-Variable (the horizontal variable) and the Y-Variable (the vertical variable) from the lists at the right, then press the "Update" button.

The scatter plot will appear in the center. Above it is the regression equation in the form

Y-Variable = Intercept + Slope * (X-Variable),

along with RMSE (the root mean square error) and r (the correlation coefficient).

(Be sure the "Blank Plot" box at the upper left is not checked.)
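For readers who want to check the displayed numbers by hand, here is a minimal sketch of how the intercept, slope, RMSE, and r can be computed from the two variables. The function name fit_line and the sample data are made up for illustration. The sketch assumes the RMSE divides the sum of squared residuals by n; some programs divide by n - 2 instead, and this page does not say which convention the plot uses.

```python
import math

def fit_line(x, y):
    """Least-squares intercept, slope, RMSE, and r for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the averages.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Assumption: RMSE divides by n (not n - 2).
    rmse = math.sqrt(sum(e ** 2 for e in residuals) / n)
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, rmse, r

# Illustrative data only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope, rmse, r = fit_line(x, y)
print(f"Y = {intercept:.2f} + {slope:.2f} * (X), RMSE = {rmse:.3f}, r = {r:.3f}")
```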

Lines

Along the left, there are four check boxes that draw lines on the scatter plot. The first draws the Regression (least squares) line, and the second draws the SD-Line. The third draws two lines parallel to the regression line, each one RMSE away from it, that is, Regression Line - RMSE and Regression Line + RMSE. The fourth draws lines two RMSEs from the regression line, at Regression Line - 2*RMSE and Regression Line + 2*RMSE. You can select as many or as few of these boxes as you wish.
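The four kinds of lines can each be described by an intercept and a slope. The sketch below is illustrative only: the function name line_specs and the dictionary keys are invented. It assumes the usual textbook SD-Line (through the point of averages, with slope SD of Y divided by SD of X, carrying the sign of r), and it assumes the SDs and the RMSE divide by n. The offsets are vertical, matching "Regression Line + RMSE" above.

```python
import math

def line_specs(x, y):
    """Return {name: (intercept, slope)} for the regression line, SD-Line, and RMSE bands."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    r = cov / (sd_x * sd_y)

    reg_slope = r * sd_y / sd_x
    reg_int = mean_y - reg_slope * mean_x
    # Assumption: RMSE computed with divisor n, so RMSE = sqrt(1 - r^2) * SD of Y.
    rmse = math.sqrt(1 - r ** 2) * sd_y

    # Assumption: textbook SD-Line, through the point of averages.
    sd_slope = math.copysign(sd_y / sd_x, r)
    sd_int = mean_y - sd_slope * mean_x

    return {
        "regression": (reg_int, reg_slope),
        "sd_line": (sd_int, sd_slope),
        "minus_1_rmse": (reg_int - rmse, reg_slope),
        "plus_1_rmse": (reg_int + rmse, reg_slope),
        "minus_2_rmse": (reg_int - 2 * rmse, reg_slope),
        "plus_2_rmse": (reg_int + 2 * rmse, reg_slope),
    }

# Illustrative data only.
specs = line_specs([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, (b0, b1) in specs.items():
    print(f"{name}: Y = {b0:.3f} + {b1:.3f} * X")
```

Unless r is exactly 1 or -1, the SD-Line is steeper than the regression line, which is why the two are worth drawing together.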

The regressogram

In addition to the above lines, a regressogram can be drawn on the plot. A regressogram splits the range of the X-Variable into a number of bins, then for each bin draws a horizontal line segment whose height is the average of the Y-Variable for the observations in that bin. The number of bins can be set by typing a number in the area next to "# bins:" (this number must be between 1 and 100).
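The binning idea can be sketched in a few lines. This is not the program's own code: the function name regressogram is invented, and the sketch assumes equal-width bins running from the smallest to the largest X value, with empty bins drawing nothing. The program may place its bin edges differently.

```python
def regressogram(x, y, n_bins):
    """Return one (left, right, height) segment per non-empty bin."""
    if not 1 <= n_bins <= 100:
        raise ValueError("number of bins must be between 1 and 100")
    lo, hi = min(x), max(x)
    # Assumption: equal-width bins spanning the observed range of X.
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for xi, yi in zip(x, y):
        # The largest X value would land one past the end, so clamp it into the last bin.
        k = min(int((xi - lo) / width), n_bins - 1) if width > 0 else 0
        sums[k] += yi
        counts[k] += 1
    segments = []
    for k in range(n_bins):
        if counts[k]:  # skip empty bins
            segments.append((lo + k * width, lo + (k + 1) * width, sums[k] / counts[k]))
    return segments

# Illustrative data only.
segments = regressogram([0, 1, 2, 3], [1, 3, 5, 7], n_bins=2)
for left, right, height in segments:
    print(f"X from {left} to {right}: average Y = {height}")
```

With one bin the regressogram is a single horizontal segment at the overall average of Y; with more bins it follows the trend in the averages more closely.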

Residuals

Checking the "Residuals" box will pop up a window containing a small scatter plot, with the X-Variable on the horizontal axis and the Residuals on the vertical axis, and a small histogram of the Residuals.

Saving the residuals

To add the residuals to the data set, just press the "Save Residuals" button. A new variable will be created. Its name will be Resid(i.j), where the i is the number of the Y-Variable and j is the number of the X-Variable. Thus if the X-Variable is first in the list of variables, and the Y-Variable is the fourth, then the residuals will be named Resid(4.1).
The residuals saved will be for the original variables, and will not reflect any changes you might have made by adding, deleting, or splitting.
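As a sketch of what gets saved, a residual is the observed Y value minus the value the regression line predicts for the same X. The helper names residuals and residual_name below are invented for illustration; only the Resid(i.j) naming pattern comes from the description above.

```python
def residuals(x, y, intercept, slope):
    """Observed Y minus the fitted value Intercept + Slope * X, for each observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def residual_name(y_index, x_index):
    """Build the name Resid(i.j): i is the Y-Variable's number, j the X-Variable's."""
    return f"Resid({y_index}.{x_index})"

# Illustrative data; the intercept and slope are the least-squares values for it.
resid = residuals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], intercept=2.2, slope=0.6)
print(residual_name(4, 1))  # Y-Variable is fourth in the list, X-Variable is first
print([round(e, 2) for e in resid])
```

Residuals from a least-squares line with an intercept always sum to zero (up to rounding), which is a quick check on a saved column.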

Splitting Plots

In the little box of choices next to the "# bins" area, you can choose whether to plot the variables using all the observations, just the observations with low values of the splitting variable, or just those with high values of the splitting variable. To use this option, a Splitting Variable must already have been chosen. To do that, press the "Choose Split" button.

Once the splitting variable is set, you have these choices:

  1. "No Split ..." plots all the observations
  2. "Split: >=" plots just the observations whose values on the splitting variable are greater than or equal to the splitting value
  3. "Split: <" plots just the observations whose values on the splitting variable are less than the splitting value
Press "Update" when you have made your selection.
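The three choices amount to a filter on the observations. In the sketch below, the function name select_observations and the choice codes "none", ">=", and "<" are hypothetical stand-ins for the three menu entries.

```python
def select_observations(x, y, split, split_value, choice):
    """Keep the (x, y) pairs allowed by the split choice.

    choice is "none", ">=" or "<" (hypothetical codes for the three menu entries);
    split holds each observation's value on the splitting variable.
    """
    if choice == "none":
        keep = [True] * len(x)
    elif choice == ">=":
        keep = [s >= split_value for s in split]
    elif choice == "<":
        keep = [s < split_value for s in split]
    else:
        raise ValueError(f"unknown choice: {choice!r}")
    xs = [xi for xi, k in zip(x, keep) if k]
    ys = [yi for yi, k in zip(y, keep) if k]
    return xs, ys

# Illustrative data only.
x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
split = [5, 7, 7, 9]
print(select_observations(x, y, split, 7, ">="))
print(select_observations(x, y, split, 7, "<"))
```

Because one choice uses "greater than or equal to" and the other uses "less than", every observation falls into exactly one of the two split groups.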

Label/Add/Delete Points

The little choice area just to the right of the "Save Residuals" button determines what happens when you click on the scatter plot with the mouse: labeling, adding, or deleting points. Pressing "Update" undoes all the labeling, adding, and deleting and returns you to the original scatter plot.

Blank Plots

To start with a blank scatter plot, check the "Blank Plot" box at the upper left, then press "Update." A plot will appear with nothing on it. You can then add points, and delete points after you have added them. Pressing "Update" again will blank out the plot once more.

Statistics

Checking the little box labeled "Statistics" at the left will pop up the statistics (average, median, SD, minimum, maximum) of the two variables in the existing plot, and of the residuals. If points have been added or deleted, these statistics will reflect the changes. Pressing the "Statistics" button at the top of the screen instead gives the statistics for all the original variables, and does not reflect any adding or deleting done on the plots.
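The five statistics can be reproduced with Python's standard library, as in the sketch below. The function name summarize is invented, and the SD is assumed to be the population SD (divisor n); if the program divides by n - 1, statistics.stdev would be the matching call.

```python
import statistics

def summarize(values):
    """Average, median, SD, minimum, and maximum of one variable."""
    return {
        "average": statistics.mean(values),
        "median": statistics.median(values),
        # Assumption: population SD (divisor n).
        "SD": statistics.pstdev(values),
        "minimum": min(values),
        "maximum": max(values),
    }

# Illustrative data only.
summary = summarize([2, 4, 5, 4, 5])
for name, value in summary.items():
    print(f"{name}: {value}")
```

To mirror the pop-up window, apply the same function to the X-Variable, the Y-Variable, and the residuals, using whichever points are currently on the plot.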