Statistics 135: Concepts of Statistics

Course Description

This course is an introduction to the theory and application of statistical methods. It covers fundamental concepts of mathematical statistics, including survey sampling, estimation, and hypothesis testing; topics in descriptive statistics and data analysis, with particular emphasis on graphical displays; aspects of experimental design; and a variety of applications. We will cover most of the material in chapters 7-14 of the text. The computer will play a key role in the course, although no prior experience is assumed; the open source statistical package R will be used in labs to analyze real data sets and conduct simulations. These concrete activities will be a valuable complement to the lectures.

Prerequisites: Calculus and linear algebra; Statistics 134 or an equivalent course in probability theory.

 

Text: J. A. Rice. Mathematical Statistics and Data Analysis. 3rd Edition. We will cover most of chapters 7-14.

 

Library reserves: I have put two optional books on reserve. I recommend that you check them out sometime during the semester.

Instructor

John Rice
Office:  425 Evans Hall
Phone:  642-6930
Email: rice AT stat.berkeley.edu

url: www.stat.berkeley.edu/~rice
Office Hours: Wed 10-12

GSI

Mike Higgins
Office: 335 Evans
Email: mjh4646 AT stat.berkeley.edu
Office Hours: Monday 3-4, Tuesday 1-2, Thursday 1-2. 307 Evans

 

Lectures

Tu-Th 2:00-3:30. 2 LeConte

I don't allow cell phones or laptops in lecture. However, if you wish to use a laptop to take notes, please speak to me.

Lab Sections

Friday 12:00-1:00 and 2:00-3:00, 332 Evans. The section meetings will be used to instruct and help with computer assignments and to review course material.

The lab website is on Bspace.

Grading

Grades will be based on a midterm, a final exam, homework, and labs.

 

The midterm will count for 25% of your grade; the score on the midterm will be replaced by the score on the final if the latter is higher. There will be no makeup midterms: if you miss the midterm, your score on the final will be used as your midterm score. The midterm is scheduled for Oct 7.

The final exam will count for 40%. It will be on Monday Dec 15, 12:30-3:30 pm. There will be no alternative times, so if you can't take the exam at this time, don't take the course.

Homework will be assigned every week and will count for 20%. Your two lowest homework grades will be dropped. Assignments will be posted below. Homework will be collected in class on Thursdays.

There will be several labs; this component of the course will count for 15%. They require data analysis using the statistical software R (see links below).
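
As a rough illustration of how the weights above combine, here is a minimal R sketch. It assumes that every component has already been converted to a percentage (0-100) and that homework is averaged after the two lowest scores are dropped. The function name course_score and the example numbers are hypothetical, and the computation actually used for the course may differ in detail.

```r
# Hypothetical helper: combine component scores (each on a 0-100 scale)
# using the weights stated above. Assumes more than two homework scores.
course_score <- function(midterm, final, homework, labs) {
  # The midterm score is replaced by the final score if the final is higher.
  # A missed midterm can be passed as NA and is then replaced by the final.
  if (is.na(midterm) || final > midterm) midterm <- final

  # Drop the two lowest homework scores before averaging.
  hw_kept <- sort(homework)[-(1:2)]

  0.25 * midterm + 0.40 * final + 0.20 * mean(hw_kept) + 0.15 * mean(labs)
}

# Made-up scores, for illustration only.
example_score <- course_score(midterm = 70, final = 80,
                              homework = c(0, 50, 90, 100, 80, 90),
                              labs = c(90, 100))
print(example_score)
```

In this made-up example the final (80) is higher than the midterm (70), so it is used for both components, and the homework scores 0 and 50 are dropped before averaging.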

 

Academic Honesty

You are encouraged to work together with others on the homework, but you must write up your own solutions.  The same applies to labs -- you must ultimately do your own computing and writing.  So, for example, if a lab assignment involved taking a random sample, your random sample had better not be identical to any other in the class.   No collaboration is allowed on exams.  Cheating will be taken seriously and the penalties will be severe.

Links

Handouts

Demos on sampling and the data

Demo with globe

Bayes demo. I wasn't able to do this in class. I had intended to show the effects of changing the prior parameters, the number of trials, and the number of successes.

Midterm from 2007 with solutions

Old final exam and solutions

Midterm Solutions. Check the solutions against your answers and if you are still confused go over them with Mike or me.

 

 

Score distribution on the midterm (there were 26 points possible):

 

> stem(mid,scale=2); summary(mid)

  The decimal point is at the |

   7 | 0
   8 | 00
   9 |
  10 | 00
  11 | 00
  12 | 000
  13 | 0
  14 | 0000
  15 | 0
  16 | 00
  17 | 00000
  18 | 0
  19 | 00000
  20 | 000
  21 | 0000
  22 | 00000000000
  23 | 00000
  24 | 000000
  25 | 0000000000
  26 | 000000000

 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   7.00   17.00   22.00   20.05   24.00   26.00
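
If you would like to explore these scores yourself, the vector can be rebuilt from the display. The sketch below assumes that every score is a whole number (all leaves in the display are 0) and reads the count for each score off the number of leaves; the variable names score and count are my own, while mid matches the name used in the command above.

```r
# Stems (scores) and leaf counts read off the stem-and-leaf display above.
score <- 7:26
count <- c(1, 2, 0, 2, 2, 3, 1, 4, 1, 2, 5, 1, 5, 3, 4, 11, 5, 6, 10, 9)

# Rebuild the score vector: each score is repeated once per leaf.
mid <- rep(score, times = count)

length(mid)           # number of exams in the display
stem(mid, scale = 2)  # should reproduce the display
summary(mid)          # should agree with the summary shown above
```

The mean lies below the median, as you would expect when most scores are bunched near the maximum with a long tail toward low values.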

 

I see three possible explanations for this skewed distribution: (1) many students studied very hard, (2) I did a great job of teaching, (3) the midterm was too easy. I don't assign letter grades to individual components of the course. At the end of the semester the scores on components are combined in a weighted average and then letter grades are assigned. The midterm counts for 25%.

 

 

Class demos of chi-square tests: geissler.R geissler.txt cont-table.R delinq.txt smokepreg.txt

Class demos of two sample tests: ozone.R ozonecontrol.csv ozonetreat.csv calcium.R calcium.csv

Baseball data shown in class: baseball.R obp_nl.txt

Solutions to final exam

 

Schedule

Week of Aug 27: Review 4.3; 7.1-7.2

Week of Sept 3: 7.2-7.3

Week of Sept 10: 7.3.3; Begin Chapter 8. (I will not cover 7.4-7.5, but read 7.6)

Week of Sept 17: 8.1 - 8.5

Week of Sept 24: Finish 8.5; start 8.6

Week of Oct 1:

Week of Oct 6: midterm; 9.1-9.2

Week of Oct 13: 9.3 - 9.5

Week of Oct 20: 9.5; 13.1-13.4

Week of Oct 27: 11.1-11.3

Week of Nov 3: 11.3-11.4

Week of Nov 10: begin chapter 14

Week of Nov 17:

Week of Nov 24:

Week of Dec 1:

Homework

Homework will be due in class on Thursday unless otherwise specified. Late assignments will not be accepted. The list of homework assignments and due dates follows.  Show your work.

Sept 4: no homework

Sept 11: Chapter 7: 2, 4, 8, 11, 28, 32

Sept 18: Chapter 7: 14, 16, 18, 24, 34. Chapter 8: 4ac, 7ab

Sept 25: Chapter 8: 4d, 11, 14, 16ab, 18abc, 24, 28, 30

Oct 2: Chapter 8: 4e, 25, 29, 58abc, 63, 64

Oct 9: no assignment

Oct 16: Chapter 9: 4, 6, 18, 20

Oct 23: Chapter 9: 24, 26, 30, 33, 38

Oct 30: Chapter 9: 40; Chapter 13: 2, 6, 24abc, 28

Nov 6: Chapter 11: 8, 16, 18, 24, 39

Nov 13: Chapter 11: 23ab, 34, 45b, 47c (in these two problems you are not required to analyze the data), 52bcgh

Nov 20: Chapter 14: 1, 10, 11, 12, 15, 22, 23, 25

Dec 2 (Tuesday): Chapter 14: 4, 6, 8, 16a, 19

Dec 9 (Tuesday): Chapter 14: 26, 27, 30, 32