Workshop: Integrating Computing in the Statistics Curricula
Date: July 30 & 31, 2010
preceding the 2010 JSM
Location: Vancouver, British Columbia
Topics for Integrating Computing into the Statistics Curriculum
The workshop will cover many topics.
The aim is to mix discussion of
- the technical material to be taught, and
- how to teach it.
We will assume a basic background in programming, but no prior
experience with the advanced programming topics. Nor will we assume
any background in the more "modern" topics such as Web scraping,
interactive graphics and relational databases. We will therefore
introduce the material for these topics as well as approaches to
teaching it.
We will provide materials that contain much more detail
than we can cover during the two days of the workshop,
and these can be used within your own classes.
The following is the collection of possible topics we may
cover during the workshop. Which ones we cover will depend
on the makeup of the participants.
- Motivation, Philosophy and Aims of integrating computing into the
statistics curricula
- Role of programming, algorithms/computational statistics, and modern technologies for
working with data.
- Essentials of Programming
- Fundamentals
- reading and manipulating data types and structures,
- programming concepts
- graphics
- common statistical methods
- writing functions, debugging, validating.
- Identifying the concepts, abstractions and comparisons across languages.
- Advanced programming, for example:
- Efficiency, profiling, interfacing to compiled code, parallel computing;
- Concepts of object-oriented programming;
- inter-system interfaces such as shells and calling
executables
- writing software, e.g. R packages
- Data input and manipulation
- Connections, shell utilities.
- Text processing
- Regular expressions
- Shell utilities
- Web scraping and Web Services
- Parsing HTML content, XML, REST and HTTP requests
- Relational Databases
- Accessing data via SQL (Structured Query Language).
- Graphics
- Essentials of computational models for creating graphics
(e.g. common commands, lattice, ggplot2, grid)
- formats, vector versus raster graphics,
- color, composition
- opportunities for using Web-based, interactive graphics
such as Google Earth/Maps, Scalable Vector Graphics (SVG),
JavaScript, ...
- Leveraging & Developing Case Studies and exercises
- How to find rich, interesting data and design pedagogically
valuable case studies that both teach computing topics and
expose the students to statistics and data analysis in action.
- The role of computational statistics in the curriculum
- numerical optimization,
- matrix algebra computations (decompositions, etc.)
- random number generation
- Markov chain Monte Carlo (MCMC)
- bootstrapping
- cross-validation
- computer experiments and simulation
- Using computing to teach non-standard topics
We also take the opportunity to introduce methods that the students
might not see in regular statistics classes, such as k-nearest neighbors, naive
Bayes, classification and regression trees, boosting and random
forests.
- Computing Environments and Tools
Text editors, version control, remote login to compute
servers, dynamic documents (Sweave, RWordXML, SWord), ...
Duncan Temple Lang & Deborah Nolan
<statcur@stat.berkeley.edu>
Last modified: Sun Apr 25 09:44:59 PDT 2010