Syllabus for Statistics 243

Introduction to Statistical Computing
Fall, 2010
Class Web Page: http://www.stat.berkeley.edu/classes/s243/
 Introduction (1 lecture)
 Basic Unix Commands (2 lectures)
 C Programming Language (8-10 lectures)
 Algorithms for Mean and Variance (1 lecture)
 Random Number Generation (3 lectures)
 Matrix Storage and Operations (3 lectures)
 Regression Calculations (2 lectures)
 Matrix Decompositions (4 lectures)
 R Programming Language (5 lectures)
 Minimization Methods (3 lectures)
 Non-linear Regression (2 lectures)
Any remaining lectures are spent on special topics such as program maintenance, object-oriented programming, and advanced UNIX programming techniques.
The grade for this course is based on four computer projects which will be assigned throughout the semester. If you feel you have a project which would be more relevant than one of the assignments, feel free to suggest it as an alternative to an assignment given in class.
All students will be provided with a computer account on the Statistical Computer Facilities (SCF) network of SUN, Linux and Mac computers. The computer room in 342 Evans provides iMacs, and computers are also available in Room 432 Evans; you can also remotely log in to the SCF system from other campus computers or from home. If you wish to do your assignments on some other computer, keep in mind that required programs may be stored on the SCF system, and it is your responsibility to get the programs to another computer. Additionally, some of the assignments are oriented towards the UNIX operating system, so if you wish to use a non-UNIX computer, you should make sure that the necessary resources are available.
None of the following texts is required, but interested students may want to consider them, not only for this course but also as a useful part of their professional libraries:
  1. Gentle, James E.: Numerical Linear Algebra for Applications in Statistics, Springer, New York (1998)
  2. Gentle, James E.: Random Number Generation and Monte Carlo Methods, Springer, New York (1998)
  3. Kernighan, Brian W. & Pike, Rob: The UNIX Programming Environment, Prentice Hall, New Jersey (1984)
  4. Kernighan, Brian W. & Pike, Rob: The Practice of Programming, Addison-Wesley, Reading (1999)
  5. Kernighan, Brian W. & Ritchie, Dennis M.: The C Programming Language, Second Edition, Prentice Hall, New Jersey (1988)
  6. Kennedy, William J. & Gentle, James E.: Statistical Computing, Marcel Dekker, New York (1980)
  7. Thisted, Ronald A.: Elements of Statistical Computing, Chapman and Hall, New York (1988)
                                                                                                 Phil Spector
                                                                                                 Evans 495
                                                                                                 email: spector@stat




Guidelines for Assignments


The grade for this course is determined by four computer projects related to the material covered in class. Some of the assignments may seem deceptively easy, but try to avoid putting the assignments off until the last possible minute. One of the most important things to learn about programming is that it is an unpredictable venture, and a simple task often takes more time than you would think at first glance. The purpose of the assignments is to give you an opportunity to write real programs which solve real problems. Your goal should not be to simply put together a program which gets the right answer for a particular set of data, but to develop a programming style which will allow you to be comfortable in solving problems which you will encounter in your future work.
You may find it useful to use a document preparation system like LaTeX when writing your reports, but this is not required. If you have an interest in learning how to produce attractive electronically typeset documents, this may be a good time to learn, but the focus of these assignments is not to produce a pretty report.
Each assignment should consist of the following sections:
  1. An introduction, explaining in your own words what the goal of the program is and giving a brief overview of your strategy for solving the problem. In other words, this first section should outline the reasoning you used as you figured out how to get the assignment completed.
  2. You should include in your assignments the complete source code of the program which you wrote to solve the problem.
  3. Please provide a copy of the actual output of your program, as well as a copy of any input data, or a description of the data if it is very large or provided as part of the assignment. If there are parts of the output which are not self-explanatory, please be sure to annotate them so I can figure out what you are doing.
  4. Each assignment should contain a conclusion, which answers any specific questions raised in the assignment, as well as reporting on any interesting findings which you made while you were working on the assignment. If you feel you've encountered a principle or concept which has helped you understand things better, please don't hesitate to mention it, both for your own clarification, and so that I can get a better idea of how you are approaching the tasks at hand.
The preferred method for submitting your assignments is to email me (at spector@stat.berkeley.edu), with a clear indication in the subject line that you are submitting an assignment for Stat 243. Your submission should include a PDF, OpenOffice document, or Word file containing your report. If necessary, you can include a single archive (zip, rar, tar, etc.) containing any other files which you feel are relevant.
Your report need not be in any standardized format, but all of the above information should be included, and you may find it convenient to organize your work into the four sections described above.


