Workshop: Integrating Computing in the Statistics Curricula

Date: July 30 & 31, 2010, preceding the 2010 JSM

Location: Vancouver, British Columbia



Topics for Integrating Computing into the Statistics Curriculum

The workshop will cover many topics. The aim is to mix discussion of

We will assume a basic background in programming, but not in the advanced programming topics. Nor will we assume any background in some of the more "modern" topics, such as Web scraping, interactive graphics, and relational databases. We will therefore introduce the material for these topics as well as approaches to teaching it.

We will provide materials that contain much more detail than we will cover during the two days of the workshop, and these can be used within your classes.


The following is the collection of possible topics for the workshop. Which ones we cover will depend on the makeup of the participants.
  • Motivation, Philosophy and Aims of integrating computing into the statistics curricula
    Role of programming, algorithms/computational statistics, and modern technologies for working with data.
  • Essentials of Programming
    • Fundamentals
      • reading and manipulating data types and structures
      • programming concepts
      • graphics
      • common statistical methods
      • writing functions, debugging, validating.
    • Identifying the concepts, abstractions and comparisons across languages.
  • Advanced programming
    for example,
    • Efficiency, profiling, interfacing to compiled code, parallel computing;
    • Concepts of Object-oriented programming;
    • inter-system interfaces such as shells and calling executables
    • writing software, e.g. R packages
  • Data input and manipulation
    Connections, shell utilities.
  • Text processing
    • Regular expressions
    • Shell utilities
  • Web scraping and Web Services
    Parsing HTML content, XML, REST and HTTP requests
  • Relational Databases
    Accessing data via SQL (Structured Query Language).
  • Graphics
    • Essentials of computational models for creating graphics (e.g. common commands, lattice, ggplot2, grid)
    • formats, vector versus raster graphics,
    • color, composition
    • opportunities for using Web-based, interactive graphics such as Google Earth/Maps, Scalable Vector Graphics (SVG), JavaScript, ...
  • Leveraging & Developing Case Studies and exercises
    How to find rich, interesting data and design pedagogically valuable case studies that both teach computing topics and expose students to statistics and data analysis in action.
  • The role of computational statistics in the curriculum
    • numerical optimization,
    • matrix algebra computations (decompositions, etc.)
    • random number generation
    • Markov chain Monte Carlo (MCMC)
    • bootstrapping
    • cross-validation
    • computer experiments and simulation
  • Using computing to teach non-standard topics
    We also take the opportunity to introduce methods that students might not see in regular statistics classes, such as k-nearest neighbors, naive Bayes, classification and regression trees, boosting, and random forests.
  • Computing Environments and Tools
    Text editors, version control, remote login to compute servers, dynamic documents (Sweave, RWordXML, SWord), ...

Duncan Temple Lang & Deborah Nolan <statcur@stat.berkeley.edu>
Last modified: Sun Apr 25 09:44:59 PDT 2010