## 1 General Information

Most procs in SAS will have a `data=` option to specify which
data set you need to work with. Don't forget that data set options can
be specified as part of this argument. To subset data being used by a proc,
the `where` statement can be used, just as in a data step.
To process data separately by groups, the `by` statement can be
used. The `label` and `format` statements are sometimes
helpful for making the output look the way you want it to.
Many statements in SAS allow optional arguments; these are generally
specified after a slash (`/`) in your SAS program.
When you're trying to produce an output data set from a procedure, don't
overlook the possibility of using the Output Delivery System.
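
As a rough sketch of how these pieces fit together, the statements below combine a data set option, a `where` statement, a `by` statement, and `label` and `format` statements in one `proc print` step. The data set `mydata` and the variables `group`, `x` and `y` are hypothetical names chosen for illustration. The data are sorted first because `by` processing expects sorted input. The last step shows one possible use of the Output Delivery System: `Summary` is the ODS table name used by `proc means`, and `ods trace on;` will list the table names of other procedures in the log.

```sas
/* mydata, group, x and y are hypothetical names used for illustration */
proc sort data=mydata out=sorted;
    by group;
run;

/* keep= is a data set option given as part of the data= argument */
proc print data=sorted(keep=group x y) label;
    where x > 0;                  /* subset the observations    */
    by group;                     /* one listing per group      */
    label x = 'Measurement X';    /* column header for x        */
    format y 8.2;                 /* display y with 2 decimals  */
run;

/* capture the printed summary table as a data set named stats */
ods output Summary=stats;
proc means data=mydata mean std;
    var x y;
run;
```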
## 2 `proc print`

The simplest procedure in SAS is `proc print`, which displays the
values for some or all variables in a SAS data set. If you don't want
all the variables to be printed, the `var` statement can be used
to provide a list of the variables you want.
By default, `proc print` prints an observation number at the beginning
of each line; to suppress it use the `noobs` option. When more than
one panel is required to display all the variables, SAS uses this observation
number to label the multiple panels. If you'd rather use variables of your
own choosing for this, specify the variables you want to appear on each
panel in the `id` statement.
To use variable labels instead of names as the column headers, specify the
`label` option.
For more control over the way your data set is displayed, you may want to
investigate the `tabulate` and `report` procedures.
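
The options described above might be combined as in the following sketch. The data set `mydata` and the variables `name`, `x`, `y` and `z` are assumed names, not part of any real data set.

```sas
/* print only x, y and z, without observation numbers, */
/* using variable labels as column headers             */
proc print data=mydata noobs label;
    var x y z;
    label x = 'First measurement';
run;

/* identify each row (and each panel) by the value of name */
/* instead of the observation number                       */
proc print data=mydata;
    id name;
    var x y z;
run;
```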
## 3 `proc plot` and `proc gplot`

The only required statement for these procedures is the `plot`
statement. Thus to make a scatter plot with `x` on the x-axis,
and `y` on the y-axis, the statements

```sas
proc plot data=yourdata;
    plot y*x;
run;
```

could be used. If there's a grouping variable in your data set,
a plot statement like "`plot y * x = group`" will provide separate
points or lines for each level of `group`. To overlay several plots
on the same graph, use the `overlay` option, as in

```sas
plot y*x z*x / overlay;
```

The difference between `plot` and `gplot` is that `proc plot`
produces a low-resolution plot along with all the rest of SAS's output, while
`proc gplot` opens a new window, and displays a high-resolution plot. To
save such a plot, go to `File -> Export As Image` and choose an image type
and name for the saved plot. You can also choose `Edit -> Edit Current Graph`
to change the graph dynamically.
To change the appearance of the points or lines in high-resolution plots, use a
`symbol` statement *before* the `proc gplot` call.
Among the other graphics programs available in SAS are `gchart`, `gcontour`,
`gmap`, `g3d`, and `g3grid`.
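
To illustrate the `symbol` statement and the `overlay` option with `proc gplot`, here is a minimal sketch. It assumes SAS/GRAPH is available and that `yourdata` contains the hypothetical variables `x`, `y`, `z` and `group`. When lines are joined, it usually helps to have the data sorted by `x` beforehand.

```sas
/* symbol statements come before the proc gplot step;            */
/* symbol1 applies to the first group (or plot), symbol2 to the  */
/* second                                                        */
symbol1 value=dot    color=blue interpol=join;
symbol2 value=circle color=red  interpol=none;

/* separate symbols for each level of group */
proc gplot data=yourdata;
    plot y*x=group;
run;
quit;

/* two plots drawn on the same set of axes */
proc gplot data=yourdata;
    plot y*x z*x / overlay;
run;
quit;
```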
## 4 `proc means` and `proc summary`

As their names imply, these procedures produce summary statistics (like means, standard
deviations, etc.). The difference between the two is that `proc means` produces
printed output by default, and produces an output data set by request only, while `proc summary`
must be instructed to either print or produce an output data set.
Internally, the two programs are the same.
To tell these procedures which statistics they should display, you include keywords on
the `proc` statement. For example, to produce a listing with the mean, standard
error and maximum for variables `x`, `y`, and `z` in data set
`mydata`, you could use the following statements:

```sas
proc means data=mydata mean stderr max;
    var x y z;
run;
```

Notice that those are the only statistics which will be displayed.
One very useful feature of these procedures is that they can do analyses on
groups of data without the need to sort the data set first, by using the
`class` statement instead of the `by` statement. Any variables
on the `class` statement will be included in output data sets that
are created.
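
As a small illustration of the `class` statement, the following sketch computes the same statistics separately for each level of a grouping variable. The variable `group` is a hypothetical name. With `by group;` in place of `class group;`, the data set would first have to be sorted by `group`.

```sas
/* statistics for x, y and z within each level of group; */
/* no prior sorting is needed with class                 */
proc means data=mydata mean stderr max;
    class group;
    var x y z;
run;
```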
To produce an output data set with either procedure, you must use an `output`
statement. Assuming the same data set and variables as the previous example, here
are some sample output statements:
```sas
output out=new mean=mx my mz nmiss=nmx nmy nmz; /* produces mean and nmiss for each variable */
output out=new mean=mx my std=sx sy sz;         /* only outputs mean for x and y */
output out=new max=;                            /* outputs maximum using each variable's name */
```
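
Placed in a complete step, an `output` statement might be used as in the sketch below. The grouping variable `group` is an assumed name. The `nway` option restricts the output data set to the rows for the individual levels of `group`; without it, a row summarizing the whole data set is also written. The automatic variables `_TYPE_` and `_FREQ_` appear in the output data set alongside the requested statistics.

```sas
/* proc summary prints nothing unless asked, so only the */
/* output data set is created here                       */
proc summary data=mydata nway;
    class group;
    var x y z;
    output out=new mean=mx my mz nmiss=nmx nmy nmz;
run;

/* inspect the result: one row per level of group */
proc print data=new;
run;
```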

## 5 `proc univariate`

In earlier versions of SAS, there were several statistics available through
`proc univariate` that were not available
through `proc means`, but this is no longer the case. The primary reason to use
`proc univariate` is that it provides a graphical view of a variable's distribution through
the `plot` and `boxplot` options. In addition, some users find the layout
of `proc univariate`'s output to be more useful than the very abbreviated output of
`proc means`.
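
A minimal sketch of a `proc univariate` step with the `plot` option follows; the data set `mydata` and the variable `x` are assumed names.

```sas
/* detailed summary of x; the plot option adds */
/* distribution plots to the printed output    */
proc univariate data=mydata plot;
    var x;
run;
```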
## 6 `proc freq`

The main purpose of `proc freq` is to produce tabulations.
For each variable listed on the `tables` statement, SAS will produce output
showing how many times each possible value of that variable was found in a data set.
To produce a cross-tabulation (that is, a
table where the rows represent possible values for one variable, the columns represent
possible values for a second variable, and each cell holds the number of observations
in the data set with that combination of values), separate the two variable names with
an asterisk (`*`); if more than two variables are supplied this way, SAS will produce a
series of two-way tables for the last two variables, one for each combination of values
of the remaining variables.
To see multiple tabulations in a concise form (with one column for each variable, and
another column showing counts), use the `list` option of the `tables` statement.
Along with the cross-tabulation, a variety of options to the `tables` statement will
generate statistical tests regarding the independence of the variables being studied. The
`out=` option can be used with the `tables` statement to create an output
data set. Statistics for the individual cells of the table (such as row, column, and
overall percentages) are also generated by default;
a variety of keywords beginning with the letters `no` are available on the
`tables` statement to suppress these additional values.
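
The following sketch shows several `tables` statements in one step. The variables `a`, `b` and `c` are hypothetical categorical variables in an assumed data set `mydata`. The `chisq` option is one example of a test of independence, and `norow`, `nocol` and `nopercent` are examples of the keywords beginning with `no`.

```sas
proc freq data=mydata;
    /* one-way tabulation of a */
    tables a;

    /* cross-tabulation with a chi-square test of independence */
    tables a*b / chisq;

    /* three variables shown in list form, one row per combination */
    tables a*b*c / list;

    /* counts only in the printed table, plus an output data set */
    tables a*b / out=counts norow nocol nopercent;
run;
```

The data set `counts` created by `out=` holds one observation per cell, with the cell frequency in the variable `COUNT`.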
## 7 `proc reg`

`proc reg` is the basic procedure for performing regression analysis in SAS.
On the `model` statement, you put the name of your dependent variable(s) on
the left-hand side of an equals sign (`=`), and list the independent variables
on the right-hand side, separated by spaces. A large number of options are available on the
`model` statement to specify exactly what information about the regression SAS
will compute and display.
The `output` statement allows you to create output data sets with predicted
and/or residual values, while the `outest=` and `outsscp=` options of
the `proc reg` statement allow outputting parameter estimates or the
sums of squares and crossproducts matrix, respectively.
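
Putting these statements together, a regression step might look like the sketch below. The data set `mydata` and the variables `y`, `x1` and `x2` are assumed names, and `clb` (confidence limits for the parameter estimates) stands in for whichever `model` options you actually need.

```sas
/* outest= saves the parameter estimates in the data set est */
proc reg data=mydata outest=est;
    model y = x1 x2 / clb;
    /* predicted values and residuals go to the data set pred */
    output out=pred p=yhat r=resid;
run;
quit;

/* look at the saved parameter estimates */
proc print data=est;
run;
```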

File translated from TeX by TtH, version 3.67, on 25 Jul 2008, 11:41.