Special Issue - Biometrical Journal
5th International Conference on Multiple Comparison Procedures
MCP 2007, Vienna


WWW companion

[R packages] [R and C code] [Supplementary tables and figures] [Other discussion]



The information on this website refers to material presented and discussed in

S. Dudoit, H. N. Gilbert, and M. J. van der Laan, "Resampling-based Empirical Bayes Multiple Testing Procedures for Controlling Generalized Tail Probability and Expected Value Error Rates: Focus on the False Discovery Rate," (November 2007). U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 228. In review, Biometrical Journal. [Tech report #228] [Full version] [BMJ website]


R packages

The following packages are needed for the simulations performed in the above article and may be downloaded from the Bioconductor Project (Release 1.8) or R Project websites.

    Software: multtest (Version 1.16.0), qvalue (Version 1.1), MASS (Version 7.2-34).
    Experimental data: golubEsets (Version 1.4.3).
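
As a convenience, the installed versions can be compared with the versions listed above. This is a minimal sketch using only base R; the helper name `installed_version` is an assumption made for illustration and is not part of the authors' code. Newer versions of these packages may behave differently from the ones used for the article.

```r
# Package versions as listed on this page.
needed <- c(multtest = "1.16.0", qvalue = "1.1",
            MASS = "7.2-34", golubEsets = "1.4.3")

# Hypothetical helper: installed version of a package, or NA if it is absent.
installed_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# One row per package: listed version next to the installed one.
status <- data.frame(
  package   = names(needed),
  listed    = unname(needed),
  installed = vapply(names(needed), installed_version, character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(status)
```

An `NA` in the `installed` column means the package still has to be installed before the simulation code can be run.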


R and C code

The sample code below is provided as a means of fostering transparency and reproducibility in research. Details of the simulation parameter space are given below in the supplementary results section and in Section 4 of the paper.

    R Simulation Code (workhorse). [EBFDRSampleCode.r]

    C Simulation Code (for counting guessed sets of true and false hypotheses). [VS.c]


Supplementary tables and figures

These are additional results for several combinations of parameters examined within the simulation space described in the above article. Each supplemental mini-report contains graphical and numerical summaries of Type I error and average power results, as well as estimates of the proportion of true null hypotheses produced as a direct or indirect product of the multiple testing procedures (MTPs) under examination.
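
For readers who want to compute comparable summaries from their own simulation output, the sketch below estimates the four Type I error rates named further down this page (FWER, gFWER(k), TPPFP(q), FDR) and average power from per-data-set counts. It assumes the usual definitions: FWER = Pr(V > 0), gFWER(k) = Pr(V > k), TPPFP(q) = Pr(V/R > q), and FDR = E[V/R], with V/R taken as 0 when R = 0. The counts, the values of `k`, `q`, and `m1`, and the function name `error_rates` are made up for illustration; this is not the authors' simulation code and the numbers are not results from the article.

```r
# Toy counts for A = 5 simulated data sets (illustrative values only).
V  <- c(0, 1, 0, 3, 0)   # false positives: rejected true null hypotheses
R  <- c(4, 5, 0, 6, 2)   # total number of rejected hypotheses
m1 <- 10                 # hypothetical number of false null hypotheses

# Hypothetical helper: Monte Carlo estimates of the four error rates.
error_rates <- function(V, R, k = 1, q = 0.1) {
  # Proportion of false positives among rejections, 0 when nothing is rejected.
  fdp <- ifelse(R > 0, V / R, 0)
  c(FWER  = mean(V > 0),    # at least one false positive
    gFWER = mean(V > k),    # more than k false positives
    TPPFP = mean(fdp > q),  # proportion of false positives exceeds q
    FDR   = mean(fdp))      # expected proportion of false positives
}

rates <- error_rates(V, R, k = 1, q = 0.1)

# Average power: mean proportion of false null hypotheses that are rejected.
avg_power <- mean((R - V) / m1)

print(rates)
print(avg_power)
```

In a real run, `V` and `R` would have one entry per simulated data set (A = 500 or 1000 in the supplements below), and the estimates would be compared with the nominal Type I error level.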

The lists below indicate conditions which are common to each set of simulations, sorted by sample size, n. Within each supplement, results are presented for various combinations of correlation structure ("No correlation", "Empirical microarray correlation", "Constant, 0.5", and "Constant, 0.9") and proportion of true null hypotheses (0.50, 0.75, 0.95, 1.00).
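
The combinations described above can be enumerated as follows. This is a minimal sketch in base R, assuming M = 400 hypotheses as in Supplements 1-4; the helper `constant_corr` is a hypothetical name introduced here, and the empirical microarray correlation matrix is not reconstructed because it depends on the experimental data.

```r
# Labels and values copied from the description above.
correlation <- c("No correlation", "Empirical microarray correlation",
                 "Constant, 0.5", "Constant, 0.9")
pi0 <- c(0.50, 0.75, 0.95, 1.00)   # proportion of true null hypotheses

# All 4 x 4 = 16 combinations reported within each supplement.
grid <- expand.grid(correlation = correlation, pi0 = pi0,
                    stringsAsFactors = FALSE)

# Number of true null hypotheses implied by each proportion (assumes M = 400).
M <- 400
grid$n_true_null <- round(grid$pi0 * M)

# Hypothetical helper: M x M correlation matrix with a constant
# off-diagonal value rho (rho = 0 gives the "No correlation" case).
constant_corr <- function(M, rho) {
  sigma <- matrix(rho, nrow = M, ncol = M)
  diag(sigma) <- 1
  sigma
}

sigma_05 <- constant_corr(M, 0.5)
print(grid)
```

A matrix such as `sigma_05` could be passed as the `Sigma` argument of `MASS::mvrnorm` to draw correlated data, although how the article's simulations generate data is documented in Section 4 of the paper rather than here.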

Note that we used a single value, B, in three places: as the number of resampled test statistics used to estimate the null distribution of the test statistics, as the number of random guessed sets of true null hypotheses, and, in Procedure 3.1, as the number of pairs of null test statistics and guessed sets of true null hypotheses. In practice, these three numbers could differ.

Supplements 1-4. A=500 data sets, B=10000 (vectors of) resampled test statistics and sets of true null hypotheses, M=400 hypotheses, common alternative shift parameter d=2.

Supplement 5. A=500 data sets, B=10000 (vectors of) resampled test statistics and sets of true null hypotheses, M=40 hypotheses, common alternative shift parameter d=2.

Supplements 6 & 7. A=1000 data sets, B=5000 (vectors of) resampled test statistics and sets of true null hypotheses, M=400 hypotheses, common alternative shift parameter d=2.

Supplements 8 & 9. A=1000 data sets, B=5000 (vectors of) resampled test statistics and sets of true null hypotheses, M=40 hypotheses, common alternative shift parameter d=2.

Supplements 10 & 11. A=1000 data sets, B=5000 (vectors of) resampled test statistics and sets of true null hypotheses, M=400 hypotheses, common alternative shift parameter d=3.

Supplements 12 & 13. A=500 data sets, B=5000 (vectors of) resampled test statistics and sets of true null hypotheses, M=2000 hypotheses, common alternative shift parameter d=2.



Other discussion

Response to Reviewer Comments.

Resampling-based Empirical Bayes Multiple Testing Procedure Flowchart. [Flowchart Process Guide]

Poster Highlighting Working Implementation of Proposed Methods in multtest.

Presented at the workshop, "High Dimensional Statistics in Biology," held at the Isaac Newton Institute for Mathematical Sciences, Cambridge, UK, 31 March to 4 April, 2008. The empirical Bayes methods for several common choices of Type I error rate (FWER, gFWER(k), TPPFP(q), and FDR) are beta-tested on a microarray dataset with a large number of hypotheses and under conditions in which we expect to observe good approximate Type I error control (i.e., moderate levels of correlation, an adequate proportion of significant features, etc.).

A developer package with new documentation and additional features should be available in mid-to-late summer 2008 through the Bioconductor Project website.