Bin Yu Publications

Bin Yu


  1. L. Miratrix, J. Jia, B. Yu, B. Gawalt, L. El Ghaoui, L. Barnesmoore, S. Clavier (2014). Concise comparative summaries (CCS) of large text corpora with a human experiment. Annals of Applied Statistics (to appear).

  2. Y. Benjamini and B. Yu (2014). The shuttle estimator for explainable variance in fMRI experiments. Annals of Applied Statistics (to appear).

  3. A. Bloniarz, A. Talwalkar, J. Terhorst, M. Jordan, D. Patterson, B. Yu and Y. Song (2014). Changepoint Analysis for Efficient Variant Calling. Proc. of RECOMB 2014 (to appear).

  4. N. El Karoui, D. Bean, P. Bickel, and B. Yu (2014). Optimal M-estimation in high-dimensional regression. Proceedings of National Academy of Sciences (to appear).

  5. N. El Karoui, D. Bean, P. Bickel, C. Lim, and B. Yu (2014). On robust regression with high-dimensional predictors. Proceedings of National Academy of Sciences (to appear).

  6. C. Lim and B. Yu (2013). Estimation Stability with Cross Validation (ESCV) (submitted).

  7. P. Ma, M. W. Mahoney, B. Yu (2013). A Statistical Perspective on Algorithmic Leveraging. Manuscript.

  8. A. S. Hammonds, C. A. Bristow, W. W. Fisher, R. Weiszmann, S. Wu, V. Hartenstein, M. Kellis, B. Yu, E. Frise, and S. E. Celniker (2013). Spatial expression of transcription factors in Drosophila embryonic organ development. Genome Biology, 14(12), R140.

  9. H. Liu and B. Yu (2013). Asymptotic properties of Lasso+mLS and Lasso+Ridge in sparse high-dimensional linear regression. Electron. J. Statist., 7, 312-3169.

  10. J. Mairal and B. Yu (2013). Supervised Feature Selection in Graphs with Path Coding Penalties and Network Flows. Journal of Machine Learning Research, 14, 2449-2485.

  11. Y. Wang, X. Jiang, B. Yu, M. Jiang (2013). A Hierarchical Bayesian Approach for Aerosol Retrieval Using MISR Data. J. American Statistical Association, 108, 483-493.

  12. B. Yu (2013). Stability. Bernoulli, 19 (4), 1484-1500.

  13. Y. He, J. Jia and B. Yu (2013). Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs. Annals of Statistics, 41(4), 1742-1779.

  14. C. Uhler, G. Raskutti, and P. Buhlmann and B. Yu (2012). Geometry of faithfulness assumption in causal inference. Annals of Statistics, 41, 436-463.

  15. L. Miratrix, J. Sehkon, and B. Yu (2012). Adjusting Treatment Effect Estimates by Post-Stratification in Randomized Experiments. Journal of Royal Statistical Society, Series B, 75 (part 2), 369-396.

  16. J. Mairal and B. Yu (2012). Complexity analysis of the Lasso regularization path. Proc. of International Conference on Machine Learning (ICML) .

  17. Yanfeng Gu, Shizhe Wang, Tao Shi, Yinghui Lu, Eugene E. Clothiaux, and Bin Yu (2012). Multiple-kernel learning-based unmixing algorithm for estimation of cloud fractions with MODIS and CLOUDSAT data. Proc. of IEEE International Geoscience and Remote Sensing Symposium (IGRSS) .

  18. G. Raskutti, M. Wainwrigt, and B. Yu (2012) Minimax-optimal rates for sparse additive models over kernel classes via convex programming. J. Machine Learning Research, 13, 389-427.

  19. S. Negahban, P. Ravikumar, M. Wainwrigt, and B. Yu (2012) A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Statistical Science, 27, 538-557.

  20. J. Jia, K. Rohe and B. Yu (2012) The Lasso under Poisson-like Heteroscadecity. Statistica Sinica, 23, 99-118.

  21. K. Rohe, S. Chatterjee, and B. Yu (2011). Spectral clustering and the high-dimensional Stochastic Block Model. Annals of Statistics, 39 (4), 1878-1915

  22. G. Raskutti, M. Wainwright, B. Yu (2011). Minimax rates of estimation for high-dimensional linear regression over lq-balls. IEEE Trans. Inform. Th., 57(10), 6976-6994.

  23. P. Ravikumar, M. Wainwright, G. Raskutti, B. Yu (2011). High-dimensional covariance estimation by minimizing l1-penalized log-determinant divergence. Electronic Journal of Statistics, 5, 935-980.

  24. S. Nishimoto, A. T. Vu, T. Naselaris, Y. Benjamini, B. Yu, J. L. Gallant (2011). Reconstructing visual experiences from brain activity evoked by natural movies. Current Biology, 21(19), 1641-1646. related videos

  25. G. Raskutti, M. Wainwrigt, and B. Yu (2010) Restricted Eigenvalue Properties for Correlated Gaussian Designs. Journal of Machine Learning Research, 11, 2241-2259.

  26. Y. Han, F. Wu, J. Jia, Y. Zhuang and B. Yu (2010). Multi-task Sparse Discriminant Analysis (MtSDA) with Overlapping Categories. Proc. of The 24th AAAI Conference on Artificial Intelligence, July 11-15, Atlanta, GA.

  27. L. Huang, J. Jia, B. Yu, B. Chun, P. Maniatis, M. Naik (2010). Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression. Proc. NIPS 2010.

  28. B. Gawalt, J. Jia, L. Miratrix, L. El Ghaoui, B. Yu, and S. Clavier (2010). Discovering Word Associations in News Media via Feature Selection and Sparse Classification. Proc. 11th ACM SIGMM International Confernece on Multimedia Information Retrieval (MIR).

  29. J. Jia and B. Yu (2010). On model selection consistency of elastic net when p >>n. Statistica Sinica, 10, 595-611.

  30. J. Jia, Y. Benjamini, C. Lim, G. Raskutti, B. Yu (2010). Comment on "Envelope models for parsimonious and efficient multivariate linear regression" by R. D. Cook, B. Li, and F. Chiaromonte. Statistica Sinica, 20, 961-967.

  31. B. Yu (2010). Remembering Leo. Annals of Applied Statistics, 4(4), 1657-1659.

  32. P. Zhao, G. Rocha, and B. Yu (2009). The composite absolute penalties family for grouped and hierarchical variable selection Annals of Statistics 37, 3468?C3497.

  33. N. Meinshausen and B. Yu (2009). Lasso-type recovery of sparse representations for high-dimensional data. Annals of Statistics 37, 246?C270.

  34. S. Negahban, P. Ravikumar, M. Wainwright, and B. Yu (2009). A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers Proc. NIPS, 2009.

  35. G. Raskutti, M. Wainwrigt, and B. Yu (2009) Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness. Proc. NIPS, 2009.

  36. G. V. Rocha, X. Wang, and B. Yu (2009) Asymptotic distribution and sparsistency for l1-penalized parametric M-estimators with applications to linear SVM and logistic regression.

  37. T. Shi, M. Belkin, and B. Yu, (2009) Data Spectroscopy: Eigenspace of Convolution Operator and Clustering Annals of Statistics, 37 (6B), 3960-3984.

  38. Vincent Q. Vu, Bin Yu, Robert E. Kass (2009) Information In The Non-Stationary Case Neural computation 21, 688?C703.

  39. E. Anderes, B. Yu, V. Jovanovic, C. Moroney, M. Garay, A. Braverman, E. Clothiaux (2009) Maximum Likelihood Estimation of Cloud Height from Multi-Angle Satellite Imagery. Annals of Applied Statistics, 3, 902-921

  40. G. Raskutti, M. Wainwright, B. Yu (2009) High-dimensional regression under lq-ball sparsity: Optimal rates of convergence. Proc. of Allerton Conference on Communication, Control, and Computing.

  41. P. Ravikumar, G. Raskutti, M. Wainwright, B. Yu (2008) Model selection in Gaussian graphical models: high-dimensional consistency of l1-regularized MLE. In Adavances in Neural Information Processing Systems (NIPS) 21, 2008.

  42. P. Ravikumer, V. Vu, B. Yu, T. Naselaris, K. Kay, J. Gallant (2008). Nonparametric sparse hiearchical models describe V1 fMRI responses to natural images In Adavances in Neural Information Processing Systems (NIPS) 21, 2008.

  43. G. Rocha, P. Zhao, and B. Yu (2008) A path following algorithm for Sparse Pseudo-Likelihood Inverse Covariance Estimation (SPLICE). Tech. Report 759, Statistics Department, UC Berkeley.

  44. Peter Buhlmann and Bin Yu (2008) Invited discussion on "Evidence contrary to the statistical view of boosting (D. Mease and A. Wyner)". (paper with discussion) Journal of Machine Learning Research 9, 187-194.

  45. T. Shi, M. Belkin, and B. Yu (2008). Data spectroscopy: learning mixture models using eigenspaces of convolution operators. Proc. of ICML 2008.

  46. G. Rocha and B. Yu (2007). Greedy and relaxed approximations to model selection: a simultation study. Festschrift in honor of Jorma Rissannen on the occasion of his 75th birthday. (eds, P. Grunwald, P. Myllymaki, I. Tabus, M. Weinberger, B. Yu), Tampere International Center for Signal Processing, TICSP Series, #38. pp. 63-80.

  47. N. Meinshausen, G. Rocha, and B. Yu (2007). A tale of three cousins: Lasso, L2Boosting, and Danzig Annals of Statistics (invited discussion on Candes and Tao's Danzig Selector paper)

  48. V. Vu, B. Yu, and R. Kass (2007). Coverage Adjusted Entropy Estimation. Statistics and Medicine, 26(21), 4039-4060.

  49. B. Yu (2006). Comments on: Regularization in Statistics, by P. J. Bickel and B. Li. Test, vol. 15 (2), pages 314-316.

  50. J. Gao, H. Suzuki, and B. Yu (2006). Approximation Lasso Methods for Language Modeling. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, pp. 225-232, Sydney.

  51. T. Shi, B. Yu, E. Clothiaux, and A. Braverman (2008). Daytime Arctic Cloud Detection based on Multi-angle Satellite Data with Case Studies. Journal of American Statistical Association. 103( 482), 584-593.

  52. B. Yu (2007). Embracing Statistical Challenges in the Information Technology Age Technometrics (special issue on statistics and information technologies). vol. 49 (3), 237-248.

  53. X. Jiang, Y. Liu, B. Yu and M. Jiang (2007). Comparison of MISR aerosol optical thickness with AERONET measurements in Beijing metropolitan area. Remote Sensing of Environment (Special Issue on Multi-angle Imaging SpectroRadiometer), vol. 107, pp. 45-53.

  54. Peng Zhao and Bin Yu (2006). On Model Selection Consistency of Lasso. J. Machine Learning Research, 7 (nov), 2541-2567.

  55. T. Shi, E. E. Clothiaux, B. Yu, A. J. Braverman, and G. N. Groff (2007). Detection of Daytime Arctic Clouds using MISR and MODIS Data. Remote Sensing of Environment (Special Issue on Multi-angle Imaging SpectroRadiometer), vol. 107, pp. 172-184.

  56. P. Buhlmann and B. Yu (2006). Sparse Boosing Journal of Machine Learning Research ( 7 (June), 1001-1024). This is a shortened and more focused version of Buhlmann and Yu "Boosting, Model Selection, Lasso and Nonnegative Garotte" given below.

  57. T. Shi and B. Yu (2005). Binning in Gaussian Kernel Regularization. Statistica Sinica (special issue on machine learning), 16, 541-567.

  58. G. Liang, N. Taft, and B. Yu (2005). A fast lightweight approach to origin-destination IP traffic estimation using partial measurements. Tech Report 687, Statistics Department, UCB (accepted for Special Issue of IEEE-IT and ACM Networks on data networks, Jan. 2006)

  59. C. D. Giurcaneanu and B. Yu (2005). Efficient algorithms for discrete universal denoising for channels with memeory. Tech. Report 686, Statistics Department, UCB (Proc. ISIT, Sept. 2005)

  60. P. Buhlmann and B. Yu (2005). Boosting, Model Selection, Lasso and Nonnegative Garotte. Tech. Report, UC Berkeley.

  61. Tong Zhang and B. Yu (2005). Boosting with early stopping: convergence and consistency. The Annals of Statistics. Vol. 33, 1538-1579.

  62. D. J. Diner et al (2004). PARAGON: A Systematic, Integrated Approach to Aerosol Observation and Modeling. American Meterological Society, Oct., 1491-1501.

  63. P. Zhao and B. Yu (2004). Stagewise Lasso (old title: Boosted Lasso) Tech. Report #678, Statistics, UC Berkeley (December, 2004; revised in April, 2005. accepted by J. Machine Learning Res, 2007).

  64. R. Jorsten and B. Yu (2004). Compressing genomic and proteomic array images for statistical analyses. Invited chapter in a book on Genomic Engineering.

  65. T. Shi, B. Yu, E. Clothiaux, A. Braverman (2004). Cloud detection over ice and snow using MISR data. Tech. Report 663, Stat Dept, UCB.

  66. P. Buhlmann and B. Yu (2004). Discussion on three boosting papers by Jiang, Lugosi and Vayatis, and Zhang Annals of Statistics. 32 (1): 96-101.

  67. R. Castro, M. Coates, G. Liang, R. Nowak, and B. Yu (2003). Internet Tomography: Recent Developments Statistical Science. Vol. 19(3), 499-517.

  68. G. Liang and B. Yu (2003). Maximum Pseudo Likelihood Estimation in Network Tomography. IEEE Trans. on Signal Processing (Special Issue on Data Networks). 51(8), 2043-2053

  69. R. Jornsten, W. Wang, B. Yu, and K. Ramchandran (2003). Microarray image compression: SLOCO and the effects of information loss. Signal Processing Journal (Special Issue on Genomic Signal Processing). 83, 859-869.

  70. Rebecka Jornsten and Bin Yu (2003). Simultaneous Gene Clustering and Subset Selection for Classification via MDL. Bioinformatics. 19(9): 1100-1109.

  71. Mark Hansen and Bin Yu (2002). Minimum Description Length Model Selection Criteria for Generalized Linear Models. {\em Science and Statistics: Festschrift for Terry Speed}, IMS Lecture Notes -- Monograph Series, Vol. 40.

  72. Rebecka Jornsten, and Bin Yu (2002). Multiterminal Estimation: Extensions and a Geometric interpretation. Proceedings of International Symposium on Information Theory (ISIT), June, 2002.

  73. Peter Buhlmann and Bin Yu (2003). Boosting with the L2 Loss: Regression and Classification. J. Amer. Statist. Assoc. 98, 324-340.

  74. Mark Coates, Alfred Hero, Robert Nowak, and Bin Yu (2002). Internet Tomography. Signal Processing Magazine. vol. 19, No. 3 (May issue), 47-65.

  75. Gerald Schuller, Bin Yu, Dawei Huang, and Bern Edler (2002). Perceptual Audio Coding using Pre- and Poster- Filters and Lossless Compression. IEEE Trans. Speech and Audio Processing. Vol. 10 (6), 379-390

  76. Rebecka Jornsten and Bin Yu (2000). ``Comprestimation": Microarray Images in Abundance. Proc. of Conference on Information Science and Systems. Princeton, March 14-17, 2000.

  77. Jin Cao, Scott Vander Wiel, Bin Yu, and Zhenyuan Zhu (2000). [ PDF | A Scalable Method for Estimating Network Traffic Matrices from Link Counts. Preprint.

  78. Peter Buhlmann and Bin Yu (2002). Analyzing Bagging. Annals of Statistics vol. 30, 927-961.

  79. Peter Buhlmann and Bin Yu (2000a). Discussion. Additive logistic regression: a statistical view of boosting, by Friedman, J., Hastie, T. and Tibshirani, R. Annals of Statistics. Vol. 28, 377-386

  80. Mark Hansen and Bin Yu (2000). Wavelet thresholding via MDL for natural images. IEEE Trans. Inform. Theory (Special Issue on Information Theoretic Imaging). vol. 46, 1778-1788.

  81. Rebecka Jornsten and Bin Yu (1999). Insensitivity of model estimation in adaptive scalar-quantization for wavelet subband coding. IEEE Trans. Inform. Theory. submitted.

  82. Jin Cao, Drew Davis, Scott Vander Wiel and Bin Yu (2000). [ PDF | Time-varying network tomography: router link data. J. Amer. Statist. Assoc. vol. 95, 1063-1075.

  83. Jorma Rissanen and Bin Yu (2000). Coding and compression: a happy union of theory and practice. J. Amer. Statist. Assoc. (Year 2000 Commemorative Vignette on Engineering and Physical Sciences). vol. 95, 986-988.

  84. Lei Li and Bin Yu (2000). Iterated logarithm expansions of the pathwise code lengths for exponential families. IEEE Trans. Inform. Theory. vol. 46, 2683-2689.

  85. Bin Yu (1999). Codes and Models. IMS Special Invited Paper (Talk Slides) , ENAR/IMS/ASA Spring Meeting, Atlanta, March.
    [ PDF | PostScript ]

  86. Mark Hansen and Bin Yu (2001). Model selection and Minimum Description Length principle. J. Amer. Statist. Assoc. vol. 96, 746-774.

  87. G. Chang, B. Yu and M. Vetterli (2000). Spatially adaptive wavelet thresholding based on context modeling for image denoising. IEEE Trans. Image Processing, vol. 9, 1522-1531.

  88. G. Chang, B. Yu and M. Vetterli (2000). Adaptive wavelet thresholding for image denoising and compression. IEEE Trans. Image Processing, vol. 9, 1532-1546.

  89. G. Chang, B. Yu and M. Vetterli (2000). Wavelet thresholding for multiple noisy image copies. IEEE Trans. Image Processing, vol. 9, 1631-1635.

  90. Y. Yoo, A. Ortega, and B. Yu (1999). Image subband coding using context-based classification and adaptive quantization. IEEE Trans. Image Processing, vol. 8, 1702-1215.

  91. P. Gong, R. Pu and B. Yu (1999) Conifer species recognition: effects of data transformatoin and band width. Remote Sensing of Environment, (in press).

  92. B. Yu, M. Ostland, P. Gong and R. Pu (1999). Penalized discriminant analysis of in situ hyperspectral data for conifer species recognition. IEEE Trans. Geoscience and Remote Sensing, in press.

  93. A. Barron, J. Rissanen, and B. Yu (1998). The Minimum Description Length principle in coding and modeling. (Special Commemorative Issue: Information Theory: 1948-1998) IEEE. Trans. Inform. Th., Oct, 2743-2760.

  94. B. Yu and P. Mykland (1998). Looking at Markov samplers through cusum path plots: a simple diagnostic idea. Statistics and Computing , 8, 275-286.

  95. M. Ostland and B. Yu (1997). Exploring quasi Monte Carlo for marginal density approximation. Statistics and Computing, 7, 217-228.

  96. P. Gong, R. Pu, and B. Yu (1997). Conifer species recognition with in Situ hyperspectral data. Remote Sensing of Environment, 62, 189-200.

  97. B. Yu and T. P. Speed (1997). Information and the clone mapping of chromosomes. Ann. Statist. 25, 169-185.

  98. D. Nelson, T. Speed, and B. Yu (1997). The limits of random fingerprinting. Genomics, 40, 1-12.

  99. B. Yu (1997). Assouad, Fano, and Le Cam. Festschrift for Lucien Le Cam . D. Pollard, E. Torgersen, and G. Yang (eds), pp. 423-435, Springer-Verla g.

  100. G. Chang, B. Yu, and M. Vetterli (1997). Bridging compression to wavelet thresholding as a denoising method. In Proc. of the 31st Conference on Information Sciences and Systems, The Johns Hopkins Univ, Maryland.

  101. Y. Yoo, A. Ortega, and B. Yu (1996). Adaptive quantization of image subbands with efficient overhead rate selection. In Proceedings of IEEE International Conference on Image Processing, Lausanne, Switzerland.

  102. B. Yu (1996). Minimum Description Length Principle: a Review. In Proc. of the 30th Conference on Information Sciences and Systems, Princeton Univ, New Jersy.

  103. B. Yu (1996). Lower bounds on expected redundancy for nonparametric classes. IEEE Trans. on Information Theory, 42, 272-275.

  104. J. Rissanen and B. Yu (1995). MDL learning. In Learning and Geometry: Computational Approaches, Progress in Computer Science and Applied Logic, 14, David Kueker and Carl Smith (eds), Birkhäuser, Boston, pp. 3-19.

  105. B. Yu (1995). Comment: Extracting more diagnostic information from a single run using cusum path plot. Statist. Sci., 10, 54-58.

  106. P. Mykland, L. Tierney, and B. Yu (1995). Regeneration in Markov Chain samplers. J. Amer. Statist. Assoc., 90, 233-241.

  107. B. Yu (1994a). Rates of convergence for empirical processes of stationary mixing sequences. Ann. Probab. 22, 94-116.

  108. M. Arcones and B. Yu (1994a). Central limit theorems for empirical and U-processes of stationary mixing sequences. J. Theor. Probab. 7, 47-71.

  109. B. Yu (1994b). Lower bound on the expected redundancy for classes of continuous Markov sources. In Statistical Decision Theory and Related Topics V, S. S. Gupta and J. O. Berger (eds), 453-466.

  110. M. Arcones and B. Yu (1994b). Limit theorems for empirical processes under dependence. In Proc. in Chaos expansions, multiple Itô--Wiener integrals and their applications. 205-221.

  111. A. R. Barron, Y. Yang and B. Yu (1994). Asymptotically optimal function estimation by minimum complexity criteria. In Proceedings of 1994 International Symposium on Information Theory, pp. 38, Trondheim, Norway.

  112. B. Yu (1993). Density estimation in the norm for dependent data with applications to the Gibbs sampler. Ann. Statist. 21, 711-735.

  113. B. Yu and T. Speed (1993). A rate of convergence result for a universal D-semifaithful code. IEEE Trans. on Information Theory 39, 8813-820.

  114. T. Speed and B. Yu (1993). Model selection and prediction: normal regression. J. Inst. Statist. Math. 45, 35-54.

  115. J. Rissanen, T. Speed and B. Yu (1992). Density estimation by stochastic complexity. IEEE Trans. on Information Theory, 38, 315-323.

  116. B. Yu and T. Speed (1992) Data compression and histograms. Probability Theory and Related Fields, 92, 195-229.


  1. B. Yu (1994). Estimating the error of kernel estimators for Markov samplers. Technical Report 409, Department of Statistics, UC Berkeley.

  2. B. Yu (1996). A Statistical analysis of adaptive scalar quant ization based on quantized past data.