Tips for Better SAS Programming
-
Develop a personal programming style, with respect to programming
techniques (loops, arrays, informats, formats, etc.) as well as
comments, indentation, step boundaries, etc. Try to index your programs
by what they do so they can be found and reused. Use comments when you're
doing something particularly clever or difficult.
-
Don't reinvent the wheel. Make sure you have access to the manuals and
books that you need. Learn about sources for sample programs,
macros and advice. Check out the WWW resources
that are available. Try to find the local SAS guru if there is one.
If you can't find a way to do something, or the way you're trying
is not satisfactory for some reason, ask for help.
-
Never hesitate to experiment with SAS - create a simulated data set
or use data from the sample programs if you don't have your own.
Use the obs= data set or system option or randomly sample your
data to test out ideas without taking up too much time.
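
As a minimal sketch of the ideas above: the data set names (test, sample),
the seeds and the 10% sampling rate are all illustrative assumptions, not
recommendations.

```sas
/* Simulate a small test data set; names and seeds are arbitrary. */
data test;
   do id = 1 to 1000;
      x = rannor(12345);            /* standard normal draw */
      group = mod(id, 4) + 1;       /* four made-up groups */
      output;
   end;
run;

/* obs= as a data set option: only the first 50 observations are read. */
proc means data=test(obs=50);
   var x;
run;

/* obs= as a system option limits every later step; reset it when done. */
options obs=100;
proc freq data=test;
   tables group;
run;
options obs=max;

/* Roughly a 10% random sample, to test on a spread of the data. */
data sample;
   set test;
   if ranuni(54321) < 0.10;
run;
```

Remember to set obs=max again before a production run, since the system
option stays in effect for the rest of the session.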
-
Learn to make SAS talk and listen to other applications. Make sure
you know something about editing, file manipulation and file transfer
on your computer system. Don't be afraid to learn other programs if
they will complement your SAS programming.
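
One simple way to exchange data with other applications is a
comma-separated file. The sketch below assumes the sashelp.class sample
data set is installed; the path /tmp/class.csv and the name class_copy are
hypothetical and should be changed for your system.

```sas
/* Write a SAS data set to a CSV file that other programs can read. */
proc export data=sashelp.class
            outfile="/tmp/class.csv"
            dbms=csv
            replace;
run;

/* Read the same file back into a SAS data set. */
proc import datafile="/tmp/class.csv"
            out=work.class_copy
            dbms=csv
            replace;
   getnames=yes;
run;
```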
-
Review your programs when they're complete to see if
-
data steps can be combined or eliminated
-
a different procedure might be used
-
unnecessary sorts are being done
-
Consider the following when you're working with a data set:
-
Use the drop= or keep= data set options if you don't need all the variables.
-
Try to use a where statement in procedures, instead of creating extra data
sets through subsetting ifs.
-
Use subsetting ifs early in data steps which do a lot of processing.
-
As space permits, store clean versions of data sets which you're
going to be using a lot. But remember that there's no difference in
processing speed between a temporary and a permanent data set (unless
they happen to be located on disks which have very different access times).
-
Use the class statement when feasible to avoid unnecessary sorting.
Conversely, use the by statement preceded by a sort for very large
data sets with lots of levels.
-
Use indexed data sets if you're going to use lots of by statements or
where clauses involving a particular variable or variables.
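
The data set tips above can be sketched as follows. This assumes the
sashelp.cars sample data set is available; the output names (cars_small,
cars_sorted, price_k) and the particular subsets are illustrative only.

```sas
/* keep= on the input data set: only the needed variables are read.   */
/* The subsetting if comes first, so rejected observations skip the   */
/* rest of the step.                                                   */
data cars_small;
   set sashelp.cars(keep=make type origin msrp mpg_city);
   if origin = 'Asia';
   price_k = msrp / 1000;
run;

/* where in a procedure: no intermediate data set is created. */
proc means data=sashelp.cars;
   where type = 'SUV';
   var mpg_city;
run;

/* class statement: grouped statistics without sorting first. */
proc means data=sashelp.cars;
   class origin;
   var msrp;
run;

/* by statement: the input must be sorted (or indexed) by origin. */
proc sort data=sashelp.cars out=cars_sorted;
   by origin;
run;

proc means data=cars_sorted;
   by origin;
   var msrp;
run;

/* Index a variable used repeatedly in by statements or where clauses. */
proc datasets library=work nolist;
   modify cars_small;
   index create make;
quit;
```

Whether class or by is faster depends on the size of the data and the
number of levels, so it is worth timing both on a sample of your own data.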
-
Try to keep up to date with new developments and procedures - the
documentation (even the on-line documentation) lags way behind.
Phil Spector
spector@stat.berkeley.edu