Statistics 215B: Applied Statistics. Spring 2012
- Instructor: P.B. Stark, stark [AT] stat [DOT] berkeley [DOT] edu
Office Hours: Tuesdays, 11am–12pm, 403 Evans Hall
- GSI: Yuval Benjamini, yuvalb [AT] stat [DOT] berkeley [DOT] edu
Office Hours: Fridays 10am–12pm, 332 Evans Hall
- Meets: Tuesday, Thursday 9:30-11am, 332 Evans Hall
- Texts: See reading list below
Course format:
3 hours of lecture per week, divided between discussing particular applications and papers, and
presenting theory and methodology.
There will be written assignments roughly every two weeks, and a term project that includes
a written report and an oral class presentation.
I hope that term projects will lead to publishable research: Bring your favorite data or favorite scientific
problem.
The written assignments will largely be drawn from Freedman's book
Statistical Models: Theory and Practice (2009 revised edition).
I will not be lecturing on all the chapters from which I assign problems: I expect students
to read and digest the material on their own, but I am happy to answer questions
in class or in office hours, and if something turns out to be a stumbling block for more
than a few students, I will lecture on it.
I plan to reserve most of the lecture time to talk about particular applications and case studies.
List of pervasive themes:
-
Making sense of probability in applications
- when the experiment creates the probability (randomization, instrumental error, etc.)
- when the scientific theory includes a random component (e.g., cosmology)
- when the analyst pretends (statistical models, in general; earthquake prediction)
- when the probability model is postulated just to evaluate plausibility
-
Cultures of different applied disciplines
- geophysics
- cosmology
- helioseismology
- litigation
- elections
- Bayesian versus frequentist leanings in different disciplines
- coherent and incoherent analyses
- attention to implicit and explicit assumptions
- statistics: tool, incantation, or fauxphisication?
-
Solving real problems versus applying methods to data
- What's the big picture? (requires learning some science)
- Data quality, data quality, data quality
- The (lazy) tendency to classify problems by data type
- Choosing a good question (requires learning some science)
- Helping design experiments (requires learning some science)
- Designing methods to fit the problem: standard ≠ appropriate (requires learning some science)
- You can't always get what you (or your collaborators) want
-
Model selection, model choice
- What's the goal? Prediction? Estimation? Adjustment? Inference?
- Occam's Razor versus The Ostrich Principle
- Post-selection inference about model parameters. Meaning, methods, and madness
-
Test selection and its perils
-
Uncertainty quantification
-
Testing nonparametric hypotheses
-
Causal inference: randomization, the Neyman model, regression adjustments to experimental data, response schedules
-
Path models
-
[possible] Hierarchical linear models
List of applications (preliminary, time permitting):
-
Geophysics:
- Earthquake prediction, hazard maps, clustering
- Seismic structure of Earth, bumps on the core-mantle boundary
- Correlation of the geoid and magnetic field
-
Astrophysics
- Microwave cosmology
- Using supernovae to measure the expansion of the universe
-
Voting
- Signature verification
- Election auditing
-
Medical research
- Placebos and active placebos
- Voodoo correlation
-
Litigation
- Sampling (in wage and hour and consumer class actions, and others)
- Damage models
-
Education, Sociology, Economics
- The effect of Catholic Schools
- Modeling credit risk
Techniques and tools likely to be discussed
- AIC, BIC, Mallows' C_p, Minimum Description Length
- Confidence sets, tests, and the duality between them
- Constraints versus priors in scientific problems
- Credible regions and their connection to confidence sets
- Inverse problems
- Linear models and least-squares
- Logit and Probit models
- Maximum likelihood
- Nonparametric inference about the mean of a restricted population
- Optimization in infinite-dimensional spaces
- Path Models
- Permutation tests, the 2-sample problem, Fisher's Exact test and generalizations
- Prediction intervals and tolerance intervals, nonparametric and Gaussian
- Randomization
- Sampling (simple, with replacement, stratified, cluster, proportional-to-size, multi-stage;
ratio estimates, confidence intervals, tests)
Reading list (preliminary)
-
Angell, M., 2011.
The Epidemic of Mental Illness: Why?,
The New York Review of Books
http://www.nybooks.com/articles/archives/2011/jun/23/epidemic-mental-illness-why/?pagination=false
-
Angell, M., 2011.
The Illusions of Psychiatry,
The New York Review of Books
http://www.nybooks.com/articles/archives/2011/jul/14/illusions-of-psychiatry/?pagination=false
-
Barron, A., J. Rissanen, and B. Yu, 1998.
The Minimum Description Length Principle in Coding and Modeling,
IEEE Trans. Info. Th., 44, 2743–2760.
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=720554
-
Benjamini, Y. and Y. Hochberg, 1995.
Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,
Journal of the Royal Statistical Society B, 57, 289–300.
http://www.jstor.org/stable/2346101
-
Berk, R., L. Brown, E. George, E. Pitkin, M. Traskin, K. Zhang, and L. Zhao, 2012.
What You Can Learn From Wrong Causal Models.
- Berk, R., L. Brown, A. Buja, K. Zhang, and L. Zhao, 2011.
Valid Post-Selection Inference,
stat.wharton.upenn.edu/~buja/PoSI.pdf
- Berk, R., L. Brown, and L. Zhao, 2009.
Statistical Inference After Model Selection,
J. Quant. Criminol., DOI 10.1007/s10940-009-9077-7.
http://statistics.wharton.upenn.edu/documents/research/BerkBrownZhao2.pdf
-
Chakraborti, S. and J. Li, 2007.
Confidence Interval Estimation of a Normal Percentile,
The American Statistician, 61, 331–336.
http://dx.doi.org/10.1198/000313007X244457
-
Chamberlain, G., 1982.
Multivariate regression models for panel data,
J. Econometrics, 18, 5–46.
http://pdn.sciencedirect.com/science?_ob=MiamiImageURL&_cid=271689&_user=4420&_pii=030440768290094X&_check=y&_origin=search&_zone=rslt_list_item&_coverDate=1982-01-31&wchp=dGLbVlV-zSkzk&md5=2f1c5dc4376cd27f7cfd359c1eeec0c0/1-s2.0-030440768290094X-main.pdf
-
Cousins, R.D., 2011.
Negatively Biased Relevant Subsets Induced by the Most-Powerful
One-Sided Upper Confidence Limits for a Bounded Physical Parameter,
http://arxiv.org/abs/1109.2023
-
Eckhardt, D.H., 1984. Correlations Between Global Features of Terrestrial Fields,
Math. Geol., 16, 155–171.
http://www.springerlink.com/content/jw023j7157806hn4/fulltext.pdf
-
Federal Judicial Center, 2000.
Reference Manual on Scientific Evidence.
www.fjc.gov/public/pdf.nsf/lookup/sciman00.pdf/$file/sciman00.pdf
Reference Guide on Statistics, David H. Kaye & David A. Freedman;
Reference Guide on Survey Research, Shari Seidman Diamond
-
Field, E.H., K.R. Milner, and the 2007 Working Group on California Earthquake Probabilities,
2008. Forecasting California's Earthquakes—What Can We Expect in the Next 30 Years?
http://pubs.usgs.gov/fs/2008/3027/fs2008-3027.pdf
-
Freedman, D.A., 2009. Statistical Models, Theory and Practice, Cambridge.
http://www.amazon.com/Statistical-Models-Practice-David-Freedman/dp/0521743850/
- Observational Studies and Experiments
- Path Models
- Maximum Likelihood
- Freedman, D.A., 2009. Statistical Models and Causal Inference: A Dialogue with the Social Sciences,
Cambridge.
http://www.amazon.com/Statistical-Models-Causal-Inference-Dialogue/dp/0521123909/
- Issues in the Foundations of Statistics: Probability and Statistical Models
- Statistical Assumptions as Empirical Commitments
- What is the Chance of an Earthquake?
- Survival Analysis: An Epidemiological Hazard?
- On Regression Adjustments in Experiments with Several Treatments
- Randomization Does not Justify Logistic Regression
- Diagnostics Cannot Have Much Power Against General Alternatives
- On Types of Scientific Inquiry: The Role of Qualitative Reasoning
-
Geller, R.J., 2011. Shake-up time for Japanese seismology,
Nature, 472, 407–409. doi:10.1038/nature10105
http://www.nature.com/nature/journal/v472/n7344/full/nature10105.html
-
Golomb, B.A., L.C. Erickson, S. Koperski, D. Sack, M. Enkin, and J. Howick, 2010.
What's in Placebos: Who Knows? Analysis of Randomized, Controlled Trials
Ann. Intern. Med., 153, 532–535.
http://www.annals.org/content/153/8/532.abstract
-
Hansen, M.H., and B. Yu, 2001.
Model Selection and the Principle of Minimum Description Length.
J. Am. Stat. Assoc., 96(454), 746–774.
doi:10.1198/016214501753168398.
http://pubs.amstat.org/doi/pdf/10.1198/016214501753168398
- Hide, R. and S.R.C. Malin, 1970.
Novel correlations between global features of the Earth's gravitational and magnetic fields,
Nature, 225, 605–609.
http://www.nature.com/nature/journal/v225/n5233/pdf/225605a0.pdf
-
Jönrup, H. and B. Rennermalm, 1976.
Regression analysis in samples from finite populations,
Scandinavian J. Statistics, 3, 33–36.
http://www.jstor.org/stable/4615605
-
Kaptchuk, T.J., W.B. Stason, R.B. Davis, A.T.R. Legedza, R.N. Schnyer, C.E. Kerr,
D.A. Stone, B.H. Nam, I. Kirsch, and R.H. Goldman, 2006.
Sham device v inert pill: randomised controlled trial of two placebo treatments,
British Medical Journal, doi:10.1136/bmj.38726.603310.55
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1370970/pdf/bmj33200391.pdf
-
Lindeman, M. and P.B. Stark, 2011.
A Gentle Introduction to Risk-Limiting Audits.
http://statistics.berkeley.edu/~stark/Preprints/gentle11.pdf
-
Loredo, T., 1994.
The return of the prodigal: Bayesian inference in astrophysics.
http://www.astro.cornell.edu/staff/loredo/bayes/return.pdf
-
McCormick, D., D.H. Bor, S. Woolhandler and D.U. Himmelstein, 2012.
Giving Office-Based Physicians Electronic Access To Patients' Prior Imaging
And Lab Results Did Not Deter Ordering Of Tests,
Health Affairs, 31, 488–496.
http://content.healthaffairs.org/content/31/3/488.full.pdf+html
NY Times article about the study:
http://www.nytimes.com/2012/03/06/business/digital-records-may-not-cut-health-costs-study-cautions.html?_r=1
-
Miratrix, L.W., J.S. Sekhon, and B. Yu, 2012.
Adjusting Treatment Effect Estimates by Post-Stratification in Randomized Experiments,
http://sekhon.berkeley.edu/papers/postadjustment.pdf
-
Morelli, A., and A.M. Dziewonski, 1987. Topography of the core-mantle boundary and lateral
homogeneity of the liquid core,
Nature, 325, 678–683.
http://www.nature.com/nature/journal/v325/n6106/pdf/325678a0.pdf
-
Moseley, J.B., K. O'Malley, N.J. Petersen, T.J. Menke, B.A. Brody, D.H. Kuykendall,
J.C. Hollingsworth, C.M. Ashton, and N.P. Wray,
2002.
A Controlled Trial of Arthroscopic Surgery for Osteoarthritis of the Knee
New Engl. J. Med., 347(2), 81–88.
http://www.nejm.org/doi/pdf/10.1056/NEJMoa013259
-
Noymer, A., A. Penner, and A. Saperstein, 2011.
Cause of death affects racial classification on death certificates.
PLoS One 6(1):e15812
https://webfiles.uci.edu/noymer/web/journal.pone.0015812.pdf
-
Pan, An, Qi Sun, A.M. Bernstein, M.B. Schulze,
J.E. Manson, M.J. Stampfer, W.C. Willett, and F.B. Hu, 2012.
Red Meat Consumption and Mortality: Results From 2 Prospective Cohort Studies
Archives of Intern Med. Published online March 12, 2012. doi:10.1001/archinternmed.2011.2287
http://archinte.ama-assn.org/cgi/content/full/archinternmed.2011.2287
Also news reports of the findings:
http://www.latimes.com/health/boostershots/la-heb-red-meat-why-bad-20120314,0,181706.story
http://www.reuters.com/article/2012/03/14/us-health-redmeat-idUSBRE82C1AT20120314
-
Peck, A.J., 2012.
Decision and Order in MONIQUE DA SILVA MOORE, et al., v. PUBLICIS GROUPE & MSL GROUP,
11 Civ. 1279 (ALC) (AJP).
http://www.mofo.com/files/Uploads/Images/120301-First-Ever-Court-Decision-on-Predictive-Coding-Attachment.pdf
-
Penner, A.M. and A. Saperstein. 2008.
How social status shapes race.
Proceedings of the National Academy of Sciences, 105, 19,628–19,630.
http://www.socsci.uci.edu/~penner/media/pnas.pdf
-
Pulliam, R.J. and P.B. Stark, 1993.
Bumps on the Core-Mantle Boundary: Are they facts or artifacts?
J. Geophysical Res., 98, 1943–1956.
http://www.agu.org/journals/jb/v098/iB02/92JB02692/92JB02692.pdf
-
Schafer, J.P., 1971.
An exact multiple comparisons test for a multinomial distribution.
British J. Math. Stat. Psych., 24(2), 267–272.
DOI: 10.1111/j.2044-8317.1971.tb00471.x
http://onlinelibrary.wiley.com/doi/10.1111/j.2044-8317.1971.tb00471.x/pdf.
-
Shearer, P.M., and P.B. Stark, 2011. The global risk of big earthquakes has not recently increased,
Proc. Nat. Acad. Sci., DOI 10.1073/pnas.1118525109,
http://www.pnas.org/content/early/2011/12/12/1118525109.full.pdf+html
-
Smoot, G.F., C.L. Bennett, A. Kogut, E.L. Wright,
J. Aymon, N.W. Boggess, E.S. Cheng, G. De Amici,
S. Gulkis, M.G. Hauser, G. Hinshaw, C. Lineweaver,
K. Lowenstein, P.D. Jackson, M. Janssen, E. Kaita,
T. Kelsall, P. Keegstra, P. Lubin, J. Mather,
S.S. Meyer, S.H. Moseley, T. Murdock, L. Rokke,
R.F. Silverberg,
L. Tenorio, R. Weiss, and D.T. Wilkinson, 1992.
Structure in the COBE DMR First Year Maps,
Astroph. J., 396, L1.
http://adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=1992ApJ...396L...1S&link_type=ARTICLE&db_key=AST&high=
-
Stark, P.B., and N.W. Hengartner, 1993.
Reproducing Earth's Kernel: Uncertainty of the shape of the
Core-Mantle Boundary from PKP and PcP Travel Times,
J. Geophys. Res., 98, 1957–1972.
http://www.agu.org/journals/jb/v098/iB02/92JB02071/92JB02071.pdf
-
Stark, P.B., 1993.
Uncertainty of the COBE Quadrupole Detection,
Astroph. J. Lett., 408, L73.
http://adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=1993ApJ...408L..73S&link_type=ARTICLE&db_key=AST&high=
-
Stark, P.B., 2008. The effectiveness of Internet content filters,
I/S: A Journal of Law and Policy for the Information Society, 4, 411–429. Preprint:
http://statistics.berkeley.edu/~stark/Preprints/filter07.pdf
-
Stark, P.B., 2009. Risk-limiting post-election audits:
P-values from common probability inequalities. IEEE Transactions on Information Forensics and Security,
4, 1005–1014.
http://statistics.berkeley.edu/~stark/Preprints/pvalues09.pdf
-
Stark, P.B., 2012. Constraints versus Priors.
http://statistics.berkeley.edu/~stark/Preprints/constraintsPriors12.pdf
-
Stein, S., R.J. Geller, and M. Liu, 2011.
Why Earthquake Hazard Maps Often Fail and What To Do About It,
http://www.earth.northwestern.edu/people/seth/Texts/mapfailure.pdf
-
Tenorio, L., P.B. Stark, and C.H. Lineweaver, 1999.
Bigger uncertainties and the Big Bang,
Inverse Problems, 15, 329–341.
http://iopscience.iop.org/0266-5611/15/1/029/pdf/0266-5611_15_1_029.pdf
-
U.S. Geological Survey, 2008. 2008 Bay Area Earthquake Probabilities.
http://earthquake.usgs.gov/regional/nca/ucerf/
-
U.S. Court of Appeals, Seventh Circuit, 2011.
Opinion in Nos. 11-1382, 11-1492 ATA AIRLINES, INC., Plaintiff-Appellee, Cross-Appellant, v.
FEDERAL EXPRESS CORPORATION, Defendant-Appellant, Cross-Appellee.
http://docs.justia.com/cases/federal/appellate-courts/ca7/11-1382/11-1382-2011-12-27-opinion-2011-12-27.pdf
-
Vul, Edward, Christine Harris, Piotr Winkielman, and Harold Pashler, 2009.
Puzzlingly High Correlations in fMRI Studies of Emotion, Personality, and Social
Cognition,
Perspectives on Psychological Science, 4(3), 274–290.
http://www.edvul.com/pdf/VulHarrisWinkielmanPashler-PPS-2009.pdf
(Also Scientific American article:
http://www.scientificamerican.com/article.cfm?id=brain-scan-results-overstated)
-
White, P.D., K.A. Goldsmith, A.L. Johnson, L. Potts, R. Walwyn, J.C. DeCesare, H.L. Baber, M. Burgess,
L.V. Clark, D.L. Cox, J. Bavinton, B.J. Angus, G. Murphy, M. Murphy, H. O'Dowd, D. Wilks, P. McCrone, T. Chalder,
and M. Sharpe, 2011.
Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care
for chronic fatigue syndrome (PACE): a randomised trial
The Lancet, DOI:10.1016/S0140-6736(11)60096-2.
http://esme-eu.com/getfile.php/Files/PACE-Trial-MRC-DWP%5B1%5D.pdf
Assignments
-
Read Freedman, Statistical Models: Theory and Practice (SMTP), Chapters 1–4;
Freedman, Statistical Models and Causal Inference: A Dialogue with the Social Sciences (SMCI), Chapters 1, 8;
(Chapter 8 is also here:
http://statistics.berkeley.edu/~stark/Preprints/611.pdf)
Shearer & Stark, 2011.
[Due 1/26 in class]
Freedman, SMTP, problems 4.B.7, 4.B.8, 4.B.11, 4.5.3, 4.5.5, 4.5.6, 4.5.10, 4.5.11.
-
[Due 2/2 in class. Relates to the climate change paper we discussed in class on 1/17.]
-
Consider a random walk with n=137 points (136 steps), constructed as follows:
X(0) = 0.
[X(i) - X(i-1)], i = 1, … 136, are IID,
and take the value +1 or -1 with probability 1/2 each.
You will test the hypothesis that a=0 on the assumption that the data (or subsets of the data)
come from the normal linear model
X(i) = a·i + b + ε_i, where the errors {ε_i}
are IID N(0, σ²), with σ² unknown (to be estimated from the data),
based on fitting the model by OLS.
-
(a) By simulation, estimate the actual significance level of a nominal 5% test of the hypothesis a = 0.
That is, estimate how often the OLS estimate of the slope a is "statistically significant at level 5%" when the
significance calculation assumes that the normal linear model is true.
Justify your choice of the number of replications in the simulation.
-
(b) By simulation, estimate the chance that the sign of the slope of the line fitted (by OLS) to the last 58 points
in the series differs from the sign of the slope of the line fitted (by OLS) to the entire series of 137 points.
Justify your choice of the number of replications.
-
(c) By simulation, estimate the chance that the sign of the slope of the line fitted (by OLS) to the last 58 points
in the series differs from the sign of the slope of the line fitted (by OLS) to the entire series of 137 points,
and that both estimated slopes are statistically significant at level 5%.
Justify your choice of the number of replications.
-
(d) By simulation, estimate the chance that the sign of the slope of the line fitted
(by OLS) to some contiguous block of 58 points
in the series differs from the sign of the slope of the line fitted (by OLS) to the entire series of 137 points,
and that both estimated slopes are statistically significant at level 5%.
Justify your choice of the number of replications.
-
(e) By simulation, estimate the chance that the sign of the slope of the line fitted (by OLS) to some contiguous
block of at least 30 points
in the series differs from the sign of the slope of the line fitted (by OLS) to the entire series of 137 points,
and that both estimated slopes are statistically significant at level 5%.
Justify your choice of the number of replications.
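A minimal sketch of how the simulation in part (a) might be set up, assuming NumPy and SciPy are available. The function names (random_walk, ols_slope_test, rejection_rate), the seed, and the default of 2,000 replications are illustrative choices, not part of the assignment. The series has 137 points, X(0) through X(136). The binomial standard error returned alongside the estimate is one possible basis for justifying the number of replications.

```python
import numpy as np
from scipy import stats


def random_walk(n_points, rng):
    """X(0) = 0 followed by n_points - 1 IID steps of +1 or -1."""
    steps = rng.choice([-1, 1], size=n_points - 1)
    return np.concatenate(([0], np.cumsum(steps)))


def ols_slope_test(x, level=0.05):
    """Fit x(i) = a*i + b by OLS. Return (slope, significant), where
    'significant' is the two-sided t-test verdict that would be valid
    if the normal linear model were true."""
    n = len(x)
    i_c = np.arange(n) - (n - 1) / 2          # centered time index
    sxx = np.sum(i_c ** 2)
    slope = np.sum(i_c * (x - x.mean())) / sxx
    resid = x - x.mean() - slope * i_c
    s2 = np.sum(resid ** 2) / (n - 2)         # estimate of sigma^2
    se = np.sqrt(s2 / sxx)
    t_crit = stats.t.ppf(1 - level / 2, df=n - 2)
    return slope, abs(slope) > t_crit * se


def rejection_rate(n_points=137, reps=2000, seed=0):
    """Estimated chance the slope is 'significant', with its standard error."""
    rng = np.random.default_rng(seed)
    hits = sum(ols_slope_test(random_walk(n_points, rng))[1] for _ in range(reps))
    p_hat = hits / reps
    return p_hat, np.sqrt(p_hat * (1 - p_hat) / reps)
```

Parts (b) through (e) could reuse ols_slope_test on slices of the same series, for example x[-58:] for the last 58 points; the slope does not depend on where the time index starts.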
-
Now consider a different generating process:
X(0) = 0. X(1) = 1.
P([X(i) - X(i-1)] = [X(i-1) - X(i-2)]) = p, and
P([X(i) - X(i-1)] = -[X(i-1) - X(i-2)]) = 1-p, i = 2, … 136.
By simulation, estimate the probabilities in parts 1(a)–1(e) (above) when this process (rather than the random walk)
generates the data, for p = 0.7, 0.8, and 0.9.
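The generating process above can be simulated in a few lines. This is only a sketch, assuming NumPy; the function name persistent_walk and its argument names are made up for illustration. Each increment after the first is the previous increment multiplied by +1 (with probability p) or by -1 (with probability 1-p), so the increments are a running product of those multipliers.

```python
import numpy as np


def persistent_walk(n_points, p, rng):
    """X(0) = 0, X(1) = 1; each later increment repeats the previous
    increment with probability p and reverses it with probability 1 - p."""
    repeat = rng.random(n_points - 2) < p      # True means keep direction
    flips = np.where(repeat, 1, -1)
    # The first increment is +1; later ones are +1 times the flips so far.
    increments = np.concatenate(([1], np.cumprod(flips)))
    return np.concatenate(([0], np.cumsum(increments)))


rng = np.random.default_rng(0)
series = {p: persistent_walk(137, p, rng) for p in (0.7, 0.8, 0.9)}
```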
-
What do you conclude about the significance of estimated regression coefficients when the regression
model did not generate the data?
What do you conclude about the climate change study? Discuss.
-
Read Freedman, SMTP, Chapter 7; Freedman, SMCI, Chapters 12, 13; White et al. (2011).
[Due 2/27].
-
Freedman, SMTP, problems 7.B.2, 7.B.3, 7.C.5, 7.D.7, 7.E.2, 7.E.3, 7.E.10, 7.5.2, 7.5.3, 7.5.4, 7.5.5
-
As we discussed in class, the experimental design used by White et al. does not match the way they
analyzed the data.
Their design was stratified on various things (study center, severity of disease, etc.), but Fisher's exact test
and the Kruskal-Wallis test assume simple randomization without stratification.
Moreover, the study does not seem to account for multiplicity in the use of Fisher's exact test to compare
three pairs of treatments.
This assignment will look at the effect of the mismatch between the design and the analysis and
the failure to take into account multiplicity on apparent p-values.
We have a population of 632 subjects (White et al. had 641 and then some were lost or excluded, and
some responses were imputed;
we're simplifying slightly).
158 subjects are assigned at random to each of four treatments.
Consider a binary outcome variable, for instance, a variable that is 1 if
at 52 weeks, the subject has improved by
either 2 or more points on the Chalder fatigue questionnaire or by 8 or more points on the short form-36,
and has improved on both; and that is zero otherwise.
Let N denote the total number of 1s among the 632 subjects.
-
Suppose N=80 for the moment. Allocate those 80 1s at random to the four treatment groups (control and
three others).
Find the three p-values for pairwise comparisons of control to each of the other
three treatments using
Fisher's exact test.
Repeat the random allocation 1,000 times.
What's the estimated chance that at least one p-value is below 0.05?
What's the estimated chance that at least two p-values are below 0.05?
Plot the empirical CDF of the smallest p-value in each simulation.
Repeat this simulation for N=160 and N=320 and report the results.
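One way the random allocation and the three pairwise tests might be coded is sketched below, assuming NumPy and SciPy (scipy.stats.fisher_exact). The function names, the seed, and the convention that group 0 is the control are illustrative assumptions. Shuffling the 632 outcomes and cutting them into four blocks of 158 is equivalent to allocating the N ones at random.

```python
import numpy as np
from scipy.stats import fisher_exact


def pairwise_fisher_pvalues(n_ones=80, group_size=158, n_groups=4,
                            reps=1000, seed=0):
    """Return a (reps, n_groups - 1) array of two-sided Fisher p-values
    comparing group 0 (treated here as the control) to each other group."""
    rng = np.random.default_rng(seed)
    outcomes = np.zeros(group_size * n_groups, dtype=int)
    outcomes[:n_ones] = 1
    pvals = np.empty((reps, n_groups - 1))
    for r in range(reps):
        # Number of 1s landing in each group under random allocation.
        counts = rng.permutation(outcomes).reshape(n_groups, group_size).sum(axis=1)
        for g in range(1, n_groups):
            table = [[counts[0], group_size - counts[0]],
                     [counts[g], group_size - counts[g]]]
            _, pvals[r, g - 1] = fisher_exact(table)
    return pvals


def summarize(pvals, level=0.05):
    """Chance of at least one / at least two small p-values, plus the
    sorted minimum p-values (the x-coordinates of the empirical CDF)."""
    n_small = (pvals < level).sum(axis=1)
    return {"at_least_one": np.mean(n_small >= 1),
            "at_least_two": np.mean(n_small >= 2),
            "min_p_sorted": np.sort(pvals.min(axis=1))}
```

Plotting min_p_sorted against (1, 2, ..., reps)/reps gives the empirical CDF; calling the first function with n_ones=160 and n_ones=320 covers the other two cases.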
-
The previous simulation ignored the stratification by centers.
Invent a generalization of Fisher's exact test that takes stratification into account:
the randomization across treatments does not mix across centers.
Think of at least three ways to combine results across strata to get an overall test statistic.
Explain what alternatives they should have the most power against.
-
Code the test in the previous question that you like best.
Base the p-value on simulation, since the test statistic no longer has a hypergeometric
distribution.
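As a hedged illustration only, here is one possible stratified test for a single treatment-versus-control comparison, assuming NumPy. It is not the intended answer: the function name, the two candidate statistics ("sum" and "abs"), and the choice to condition on the number of 1s among the two groups within each center are all assumptions made for this sketch. Under the null, the number of 1s in the treatment group within a center is then hypergeometric, independently across centers.

```python
import numpy as np


def stratified_pvalue(ones_c, n_c, ones_t, n_t, stat="sum",
                      reps=10000, seed=0):
    """Simulated p-value for one treatment-vs-control comparison.

    ones_c, ones_t: number of 1s per center in control and treatment.
    n_c, n_t:       group sizes per center.
    stat="sum": |sum over centers of (observed - expected)|; aimed at
                effects with the same sign in every center.
    stat="abs": sum over centers of |observed - expected|; aimed at
                effects whose sign may vary across centers.
    """
    rng = np.random.default_rng(seed)
    ones_c, n_c, ones_t, n_t = map(np.asarray, (ones_c, n_c, ones_t, n_t))
    total_ones = ones_c + ones_t
    total_zeros = (n_c + n_t) - total_ones
    expected = n_t * total_ones / (n_c + n_t)

    def statistic(t_ones):
        dev = t_ones - expected
        if stat == "sum":
            return np.abs(dev.sum(axis=-1))
        return np.abs(dev).sum(axis=-1)

    observed = statistic(ones_t)
    # Re-randomize within each center: 1s in treatment are hypergeometric.
    sims = rng.hypergeometric(total_ones, total_zeros, n_t,
                              size=(reps, len(n_t)))
    exceed = np.sum(statistic(sims) >= observed - 1e-9)
    return (1 + exceed) / (1 + reps)
```

The two statistics can disagree sharply when effects have opposite signs in different centers, which is one reason the choice of how to combine strata matters.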
-
Suppose centers 1 and 2 have 106 subjects and centers 3–6 each have 105 subjects.
Suppose that the reported results are as follows, where
the numbers in parentheses are the number allocated to the treatment and the numbers not in parentheses
are the number of 1s in the group.
Center | control | treatment 1 | treatment 2 | treatment 3 |
1 | 10 (27) | 15 (27) | 20 (26) | 20 (26) |
2 | 10 (27) | 15 (27) | 20 (26) | 20 (26) |
3 | 10 (27) | 15 (26) | 20 (26) | 20 (26) |
4 | 20 (27) | 15 (26) | 15 (26) | 10 (26) |
5 | 20 (27) | 15 (26) | 15 (26) | 10 (26) |
6 | 20 (27) | 15 (26) | 15 (26) | 10 (26) |
For the three paired comparisons with control, compare simulated p-values that take stratification into
account with the p-values for Fisher's exact test (which ignores stratification).
Try to find different sets of reported results that would make the two p-values differ as much
as possible for some paired comparison.
What happens if the centers have different sizes? Can you use Simpson's paradox to construct
examples where the sign of the effect is reversed?
-
Read Golomb et al. 2010; Jönrup, H. and B. Rennermalm 1976; Kaptchuk et al. 2006; Berk et al. 2009 and 2011.
[Due 3/19].
-
Simulate 1,000 iid N(0,1) random variables.
Take the subset that are larger than 2.
Find a 1-sided (upper) p-value of the z-test of the hypothesis that the subset you selected is a
random sample from a N(0,1) population.
Repeat this overall simulation 1,000 times.
Plot the empirical cdf of the p-values.
What fraction are below 0.1?
Why is that fraction so much larger than 0.1? Isn't the null hypothesis true? Discuss.
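A minimal sketch of the select-then-test simulation, assuming NumPy plus the standard library's statistics.NormalDist. The function name, the seed, and the decision to skip replications in which nothing exceeds the threshold are assumptions of this sketch. The z-statistic for a subset of size k with known unit variance is the subset mean times sqrt(k).

```python
import numpy as np
from statistics import NormalDist


def selected_pvalues(n=1000, threshold=2.0, reps=1000, seed=0):
    """Upper one-sided z-test p-values for the values exceeding threshold."""
    rng = np.random.default_rng(seed)
    std_normal = NormalDist()
    pvals = []
    for _ in range(reps):
        z = rng.standard_normal(n)
        chosen = z[z > threshold]
        if chosen.size == 0:        # nothing selected, so no test is run
            continue
        z_stat = chosen.mean() * np.sqrt(chosen.size)
        # Upper tail area, computed as cdf(-z) for numerical accuracy.
        pvals.append(std_normal.cdf(-z_stat))
    return np.array(pvals)


pvals = selected_pvalues()
fraction_below = np.mean(pvals < 0.1)
ecdf_x = np.sort(pvals)             # plot against (1..len)/len for the ECDF
```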
-
Simulate 1,000 iid N(0,1) random variables, as before, but instead of selecting those that are larger than 2,
select the 50 that are largest.
Find a 1-sided (upper) p-value of the z-test of the hypothesis that the subset you selected is a random sample from a N(0,1)
population.
Repeat this overall simulation 1,000 times.
Plot the empirical cdf of the p-values.
What fraction are below 0.1?
Why is that fraction so much larger than 0.1? Isn't the null hypothesis true? Discuss.
What's the difference between this situation and the first situation?
-
Reproduce the simulations described in the "Simulation Results" section of the Berk et al. 2009 paper,
that produced figures 3–7. Reproduce the figures.
Repeat the simulations, this time constructing 95% confidence intervals for any variables that are selected.
What fraction of the confidence intervals constructed cover their corresponding parameter?
Is there a notable difference between the coefficient in the model that is actually zero and those
that are not? Discuss.
-
Simulate 600 iid N(0,1) random variables. Divide them into 6 groups of 100.
Perform multiple linear regression of the first group onto the following 20 variables:
the 5 other groups, the squares of the 5 other groups, the cubes of the other 5 groups,
and the reciprocals of the other 5 groups.
Select any estimated coefficients that are statistically significant at level 0.05.
Construct 95% confidence intervals for just those "significant" coefficients.
Note the number of confidence intervals you constructed, and the fraction of them that include
zero—the true population value of all the coefficients in this set-up.
Repeat the simulation of 600 variables, the regression, the selection, and the construction of confidence
intervals, a total of 1,000 times.
What fraction of simulations gave one or more confidence intervals?
What fraction of simulations gave one or more confidence intervals that did not contain zero?
What fraction of the confidence intervals you constructed overall contained zero?
Discuss.
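One replication of this simulation might look like the sketch below, assuming NumPy and SciPy. Including an intercept in the regression is an assumption of this sketch (the problem statement lists only the 20 derived variables), as are the function name and the use of a QR factorization to get the standard errors.

```python
import numpy as np
from scipy import stats


def one_replication(rng, level=0.05):
    """Return (number of 'significant' coefficients, number of their
    confidence intervals that contain zero) for one simulated data set."""
    z = rng.standard_normal((6, 100))
    y, others = z[0], z[1:].T                    # others has shape (100, 5)
    X = np.column_stack([np.ones(100),           # intercept (an assumption)
                         others, others ** 2, others ** 3, 1.0 / others])
    q, r = np.linalg.qr(X)
    beta = np.linalg.solve(r, q.T @ y)           # OLS coefficients
    resid = y - X @ beta
    df = X.shape[0] - X.shape[1]
    s2 = resid @ resid / df
    r_inv = np.linalg.inv(r)
    se = np.sqrt(s2 * np.sum(r_inv ** 2, axis=1))  # sqrt of diag of s2 (X'X)^-1
    half = stats.t.ppf(1 - level / 2, df) * se
    # Select among the 20 slopes only; the intercept is not a candidate.
    b, h = beta[1:], half[1:]
    selected = np.abs(b) > h
    covers_zero = (b - h <= 0) & (0 <= b + h)
    return int(selected.sum()), int((selected & covers_zero).sum())


rng = np.random.default_rng(0)
results = np.array([one_replication(rng) for _ in range(1000)])
frac_with_any_interval = np.mean(results[:, 0] > 0)
```

The first column of results holds the number of intervals constructed in each replication and the second the number of those that contain zero, which is enough to answer the three questions above.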
-
Read McCormick et al. 2012; Pan et al. 2012; Freedman, SMCI, Chapter 11.
[Due 4/9].
Comment critically (not necessarily negatively) on McCormick et al.:
-
How were the data obtained? What kind of sample was it?
-
How did they adjust for possible confounders? Stratification? Regression?
A combination of the two?
Do they give a simple crosstab of the data?
-
What techniques were used in the analysis?
Comment on the assumptions required for those techniques to be reliable, and
discuss whether the assumptions are plausible in this application.
-
Is multiplicity an issue in their analysis?
If so, how (if at all) did they account for it?
-
The paper presents confidence intervals for various things.
What assumptions are those confidence intervals based on?
What, precisely, is random in this study?
-
Do they use model selection? Do they make confidence intervals for coefficients in
the selected model?
Comment.
-
What are the three best things about the study?
What part of their argument is most convincing?
-
If you were designing the study and the data analysis, what are the three things
you think are most important to do differently?
Explain why they are important and how the approach taken in the paper
might be misleading.
-
Do you believe the findings? Why or why not?
P.B. Stark, statistics.berkeley.edu/~stark.
http://statistics.berkeley.edu/~stark/Teach/S215B/S12/index.htm
Last modified 9 April 2012.