Peter Bartlett's Journal Papers and Book Chapters

[ZFB24]	Ruiqi Zhang, Spencer Frei, and Peter L. Bartlett. Trained transformers learn linear models in-context. Journal of Machine Learning Research, 25(49):1--55, 2024. [ bib \| .html \| Abstract ]
[MHW⁺24]	Wenlong Mou, Nhat Ho, Martin Wainwright, Peter L. Bartlett, and Michael Jordan. A diffusion process perspective on posterior contraction rates for parameters. SIAM Journal on Mathematics of Data Science, 2024. (To appear). [ bib ]
[BL24]	Peter L. Bartlett and Philip M. Long. Corrigendum to 'prediction, learning, uniform convergence, and scale-sensitive dimensions'. Journal of Computer and System Sciences, 140:103465, 2024. [ bib ]
[BMD⁺23]	Kush Bhatia, Yi-An Ma, Anca D. Dragan, Peter L. Bartlett, and Michael I. Jordan. Bayesian robustness: A nonasymptotic viewpoint. Journal of the American Statistical Association, 119(546):1112--1123, 2023. [ bib \| DOI ]
[BLB23]	Peter L. Bartlett, Philip M. Long, and Olivier Bousquet. The dynamics of sharpness-aware minimization: Bouncing across ravines and drifting towards wide minima. Journal of Machine Learning Research, 24(316):1--36, 2023. [ bib \| .html \| Abstract ]
[PKBK23]	Juan C. Perdomo, Akshay Krishnamurthy, Peter L. Bartlett, and Sham Kakade. A complete characterization of linear estimators for offline policy evaluation. Journal of Machine Learning Research, 24(284):1--50, 2023. [ bib \| .html \| Abstract ]
[FCB23]	Spencer Frei, Niladri S. Chatterji, and Peter L. Bartlett. Random feature amplification: Feature learning and generalization in neural networks. Journal of Machine Learning Research, 24(303):1--49, 2023. [ bib \| .html \| Abstract ]
[TB23]	Alexander Tsigler and Peter L. Bartlett. Benign overfitting in ridge regression. Journal of Machine Learning Research, 24(123):1--76, 2023. (See also arXiv:2009.14286). [ bib \| .html ]
[CLB22]	Niladri S. Chatterji, Philip M. Long, and Peter L. Bartlett. The interplay between implicit bias and benign overfitting in two-layer linear networks. Journal of Machine Learning Research, 23(263):1--48, 2022. [ bib \| .html ]
[MFWB22b]	Wenlong Mou, Nicolas Flammarion, Martin J. Wainwright, and Peter L. Bartlett. Improved bounds for discretization of Langevin diffusions: Near-optimal rates without convexity. Bernoulli, 28(3):1577--1601, 2022. [ bib \| DOI \| http \| Abstract ]
[CBL22]	Niladri S. Chatterji, Peter L. Bartlett, and Philip M. Long. Oracle lower bounds for stochastic gradient sampling algorithms. Bernoulli, 28(2):1074--1092, 2022. arXiv:2002.00291. [ bib \| http \| Abstract ]
[MFWB22a]	Wenlong Mou, Nicolas Flammarion, Martin J. Wainwright, and Peter L. Bartlett. An efficient sampling algorithm for non-smooth composite potentials. Journal of Machine Learning Research, 23(233):1--50, 2022. [ bib \| .html ]
[CLB21]	Niladri S. Chatterji, Philip M. Long, and Peter L. Bartlett. When does gradient descent with logistic loss find interpolating two-layer networks? Journal of Machine Learning Research, 22(159):1--48, 2021. [ bib \| http \| Abstract ]
[MCC⁺21]	Yi-An Ma, Niladri S. Chatterji, Xiang Cheng, Nicolas Flammarion, Peter L. Bartlett, and Michael I. Jordan. Is there an analog of Nesterov acceleration for gradient-based MCMC? Bernoulli, 27(3):1942--1992, 2021. [ bib \| DOI \| Abstract ]
[MMW⁺21]	Wenlong Mou, Yi-An Ma, Martin J. Wainwright, Peter L. Bartlett, and Michael I. Jordan. High-order Langevin diffusion yields an accelerated MCMC algorithm. Journal of Machine Learning Research, 22(42):1--41, 2021. [ bib \| .html ]
[BL21]	Peter L. Bartlett and Philip M. Long. Failures of model-dependent generalization bounds for least-norm interpolation. Journal of Machine Learning Research, 22(204):1--15, 2021. arXiv:2010.08479. [ bib \| .html ]
[BMR21]	Peter L. Bartlett, Andrea Montanari, and Alexander Rakhlin. Deep learning: a statistical viewpoint. Acta Numerica, 30:87–201, 2021. [ bib \| DOI \| http \| Abstract ]
[BLLT20]	Peter L. Bartlett, Philip M. Long, Gábor Lugosi, and Alexander Tsigler. Benign overfitting in linear regression. Proceedings of the National Academy of Sciences, 117(48):30063--30070, 2020. (arXiv:1906.11300). [ bib \| DOI \| arXiv \| http \| Abstract ]
[MPB⁺20]	Dhruv Malik, Ashwin Pananjady, Kush Bhatia, Koulik Khamaru, Peter L. Bartlett, and Martin J. Wainwright. Derivative-free methods for policy optimization: Guarantees for linear quadratic systems. Journal of Machine Learning Research, 21(21):1--51, 2020. [ bib \| .html ]
[BHL19]	Peter L. Bartlett, David P. Helmbold, and Philip M. Long. Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks. Neural Computation, 31:477--502, 2019. [ bib ]
[BHLM19]	Peter L. Bartlett, Nick Harvey, Christopher Liaw, and Abbas Mehrabian. Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks. Journal of Machine Learning Research, 20(63):1--17, 2019. [ bib \| .html ]
[HB17]	Fares Hedayati and Peter L. Bartlett. Exchangeability characterizes optimality of sequential normalized maximum likelihood and Bayesian prediction. IEEE Transactions on Information Theory, 63(10):6767--6773, October 2017. [ bib \| DOI \| .pdf \| .pdf \| Abstract ]
[RRB14]	J. Hyam Rubinstein, Benjamin Rubinstein, and Peter Bartlett. Bounding embeddings of VC classes into maximum classes. In A. Gammerman and V. Vovk, editors, Festschrift of Alexey Chervonenkis. Springer, 2014. [ bib \| http \| Abstract ]
[TB14]	Ambuj Tewari and Peter L. Bartlett. Learning theory. In Paulo S.R. Diniz, Johan A.K. Suykens, Rama Chellappa, and Sergios Theodoridis, editors, Signal Processing Theory and Machine Learning, volume 1 of Academic Press Library in Signal Processing, pages 775--816. Elsevier, 2014. [ bib ]
[BMN12]	Peter L. Bartlett, Shahar Mendelson, and Joseph Neeman. l₁-regularized linear regression: Persistence and oracle inequalities. Probability Theory and Related Fields, 154(1--2):193--224, October 2012. [ bib \| DOI \| .pdf \| Abstract ]
[NAB⁺12]	Massieh Najafi, David M. Auslander, Peter L. Bartlett, Philip Haves, and Michael D. Sohn. Application of machine learning in the fault diagnostics of air handling units. Applied Energy, 96:347--358, August 2012. [ bib \| DOI ]
[RBHT12]	Benjamin I. P. Rubinstein, Peter L. Bartlett, Ling Huang, and Nina Taft. Learning in a large function space: Privacy preserving mechanisms for SVM learning. Journal of Privacy and Confidentiality, 4(1):65--100, August 2012. [ bib \| http \| Abstract ]
[BRS⁺12]	A. Barth, Benjamin I. P. Rubinstein, M. Sundararajan, J. C. Mitchell, Dawn Song, and Peter L. Bartlett. A learning-based approach to reactive security. IEEE Transactions on Dependable and Secure Computing, 9(4):482--493, July 2012. [ bib \| http \| .pdf \| Abstract ]
[DBW12]	John Duchi, Peter L. Bartlett, and Martin J. Wainwright. Randomized smoothing for stochastic optimization. SIAM Journal on Optimization, 22(2):674--701, June 2012. [ bib \| .pdf \| Abstract ]
[ABRW12]	Alekh Agarwal, Peter Bartlett, Pradeep Ravikumar, and Martin Wainwright. Information-theoretic lower bounds on the oracle complexity of stochastic convex optimization. IEEE Transactions on Information Theory, 58(5):3235--3249, May 2012. [ bib \| DOI \| .pdf \| Abstract ]
[AB11]	Sylvain Arlot and Peter L. Bartlett. Margin-adaptive model selection in statistical learning. Bernoulli, 17(2):687--713, May 2011. [ bib \| .pdf \| Abstract ]
[RBR10]	Benjamin I. P. Rubinstein, Peter L. Bartlett, and J. Hyam Rubinstein. Corrigendum to `shifting: One-inclusion mistake bounds and sample compression' [J. Comput. System Sci 75 (1) (2009) 37-59]. Journal of Computer and System Sciences, 76(3--4):278--280, May 2010. [ bib \| DOI ]
[Bar10]	Peter L. Bartlett. Learning to act in uncertain environments. Communications of the ACM, 53(5):98, May 2010. (Invited one-page comment). [ bib \| DOI ]
[BMP10]	Peter L. Bartlett, Shahar Mendelson, and Petra Philips. On the optimality of sample-based estimates of the expectation of the empirical minimizer. ESAIM: Probability and Statistics, 14:315--337, January 2010. [ bib \| .pdf \| Abstract ]
[RSBN09]	David S. Rosenberg, Vikas Sindhwani, Peter L. Bartlett, and Partha Niyogi. Multiview point cloud kernels for semisupervised learning. IEEE Signal Processing Magazine, 26(5):145--150, September 2009. [ bib \| DOI ]
[RBR09]	Benjamin I. P. Rubinstein, Peter L. Bartlett, and J. Hyam Rubinstein. Shifting: one-inclusion mistake bounds and sample compression. Journal of Computer and System Sciences, 75(1):37--59, January 2009. (Was University of California, Berkeley, EECS Department Technical Report EECS-2007-86). [ bib \| .pdf ]
[LBW08]	Wee Sun Lee, Peter L. Bartlett, and Robert C. Williamson. Correction to the importance of convexity in learning with squared loss. IEEE Transactions on Information Theory, 54(9):4395, September 2008. [ bib \| .pdf ]
[CGK⁺08]	Michael Collins, Amir Globerson, Terry Koo, Xavier Carreras, and Peter L. Bartlett. Exponentiated gradient algorithms for conditional random fields and max-margin Markov networks. Journal of Machine Learning Research, 9:1775--1822, August 2008. [ bib \| .pdf \| Abstract ]
[BW08]	Peter L. Bartlett and Marten H. Wegkamp. Classification with a reject option using a hinge loss. Journal of Machine Learning Research, 9:1823--1840, August 2008. [ bib \| .pdf \| Abstract ]
[Bar08]	Peter L. Bartlett. Fast rates for estimation error and oracle inequalities for model selection. Econometric Theory, 24(2):545--552, April 2008. (Was Department of Statistics, U.C. Berkeley Technical Report number 729, 2007). [ bib \| DOI \| .pdf \| Abstract ]
[TB07]	Ambuj Tewari and Peter L. Bartlett. On the consistency of multiclass classification methods. Journal of Machine Learning Research, 8:1007--1025, May 2007. (Invited paper). [ bib \| .html ]
[BT07a]	Peter L. Bartlett and Ambuj Tewari. Sparseness vs estimating conditional probabilities: Some asymptotic results. Journal of Machine Learning Research, 8:775--790, April 2007. [ bib \| .html ]
[BT07b]	Peter L. Bartlett and Mikhail Traskin. Adaboost is consistent. Journal of Machine Learning Research, 8:2347--2368, 2007. [ bib \| .pdf \| .pdf \| Abstract ]
[BM06a]	Peter L. Bartlett and Shahar Mendelson. Discussion of “2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization” by V. Koltchinskii. The Annals of Statistics, 34(6):2657--2663, 2006. [ bib ]
[BJM06a]	Peter L. Bartlett, Michael I. Jordan, and Jon D. McAuliffe. Comment. Statistical Science, 21(3):341--346, 2006. [ bib ]
[BM06b]	Peter L. Bartlett and Shahar Mendelson. Empirical minimization. Probability Theory and Related Fields, 135(3):311--334, 2006. [ bib \| .ps.gz \| .pdf \| Abstract ]
[BJM06b]	Peter L. Bartlett, Michael I. Jordan, and Jon D. McAuliffe. Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473):138--156, 2006. (Was Department of Statistics, U.C. Berkeley Technical Report number 638, 2003). [ bib \| .ps.gz \| .pdf \| Abstract ]
[BBM05]	Peter L. Bartlett, Olivier Bousquet, and Shahar Mendelson. Local Rademacher complexities. Annals of Statistics, 33(4):1497--1537, 2005. [ bib \| .ps \| .pdf \| Abstract ]
[BJM04]	Peter L. Bartlett, Michael I. Jordan, and Jon D. McAuliffe. Discussion of boosting papers. The Annals of Statistics, 32(1):85--91, 2004. [ bib \| .ps.Z \| .pdf ]
[GBB04]	E. Greensmith, P. L. Bartlett, and J. Baxter. Variance reduction techniques for gradient estimates in reinforcement learning. Journal of Machine Learning Research, 5:1471--1530, 2004. [ bib \| .pdf ]
[LCB⁺04]	G. Lanckriet, N. Cristianini, P. L. Bartlett, L. El Ghaoui, and M. Jordan. Learning the kernel matrix with semi-definite programming. Journal of Machine Learning Research, 5:27--72, 2004. [ bib \| .ps.gz \| .pdf ]
[Bar03]	Peter L. Bartlett. An introduction to reinforcement learning theory: value function methods. In Shahar Mendelson and Alexander J. Smola, editors, Advanced Lectures on Machine Learning, volume 2600, pages 184--202. Springer, 2003. [ bib ]
[BM03]	Peter L. Bartlett and Wolfgang Maass. Vapnik-Chervonenkis dimension of neural nets. In Michael A. Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 1188--1192. MIT Press, 2003. Second Edition. [ bib \| .ps.gz \| .pdf ]
[MBG02]	L. Mason, P. L. Bartlett, and M. Golea. Generalization error of combined classifiers. Journal of Computer and System Sciences, 65(2):415--438, 2002. [ bib \| http ]
[BFH02]	P. L. Bartlett, P. Fischer, and K.-U. Höffgen. Exploiting random walks for learning. Information and Computation, 176(2):121--135, 2002. [ bib \| http ]
[BBD02]	P. L. Bartlett and S. Ben-David. Hardness results for neural network approximation problems. Theoretical Computer Science, 284(1):53--66, 2002. (special issue on Eurocolt'99). [ bib \| http ]
[BB02]	P. L. Bartlett and J. Baxter. Estimation and approximation bounds for gradient-based reinforcement learning. Journal of Computer and System Sciences, 64(1):133--150, 2002. [ bib ]
[BBL02]	P. L. Bartlett, S. Boucheron, and G. Lugosi. Model selection and error estimation. Machine Learning, 48:85--113, 2002. [ bib \| .ps.gz ]
[BM02]	P. L. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463--482, 2002. [ bib \| .pdf ]
[GBSTW02]	Y. Guo, P. L. Bartlett, J. Shawe-Taylor, and R. C. Williamson. Covering numbers for support vector machines. IEEE Transactions on Information Theory, 48(1):239--250, 2002. [ bib ]
[BBW01]	J. Baxter, P. L. Bartlett, and L. Weaver. Experiments with infinite-horizon, policy-gradient estimation. Journal of Artificial Intelligence Research, 15:351--381, 2001. [ bib \| .html ]
[BB01]	J. Baxter and P. L. Bartlett. Infinite-horizon policy-gradient estimation. Journal of Artificial Intelligence Research, 15:319--350, 2001. [ bib \| .html ]
[MBB00]	L. Mason, P. L. Bartlett, and J. Baxter. Improved generalization through explicit optimization of margins. Machine Learning, 38(3):243--255, 2000. [ bib ]
[KBB00]	L. C. Kammer, R. R. Bitmead, and P. L. Bartlett. Direct iterative tuning via spectral analysis. Automatica, 36(9):1301--1307, 2000. [ bib ]
[SSWB00]	B. Schölkopf, A. Smola, R. C. Williamson, and P. L. Bartlett. New support vector algorithms. Neural Computation, 12(5):1207--1245, 2000. [ bib ]
[PPB00]	S. Parameswaran, M. F. Parkinson, and P. L. Bartlett. Profiling in the ASP codesign environment. Journal of Systems Architecture, 46(14):1263--1274, 2000. [ bib ]
[BBDK00]	P. L. Bartlett, S. Ben-David, and S. R. Kulkarni. Learning changing concepts by exploiting the structure of change. Machine Learning, 41(2):153--174, 2000. [ bib ]
[SBSS00]	A. J. Smola, P. L. Bartlett, B. Schölkopf, and D. Schuurmans. Introduction to large margin classifiers. In Advances in Large Margin Classifiers, pages 1--29. MIT Press, 2000. [ bib ]
[MBBF00]	L. Mason, J. Baxter, P. L. Bartlett, and M. Frean. Functional gradient techniques for combining hypotheses. In A. J. Smola, P. L. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 221--246. MIT Press, 2000. [ bib ]
[AB00]	M. Anthony and P. L. Bartlett. Function learning from interpolation. Combinatorics, Probability, and Computing, 9:213--225, 2000. [ bib ]
[BST99]	P. L. Bartlett and J. Shawe-Taylor. Generalization performance of support vector machines and other pattern classifiers. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods -- Support Vector Learning, pages 43--54. MIT Press, 1999. [ bib ]
[Bar99]	P. L. Bartlett. Efficient neural network learning. In V. D. Blondel, E. D. Sontag, M. Vidyasagar, and J. C. Willems, editors, Open Problems in Mathematical Systems Theory and Control, pages 35--38. Springer Verlag, 1999. [ bib ]
[BL99]	P. L. Bartlett and G. Lugosi. An inequality for uniform deviations of sample averages from their means. Statistics and Probability Letters, 44(1):55--62, 1999. [ bib ]
[Bar98]	P. L. Bartlett. The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Transactions on Information Theory, 44(2):525--536, 1998. [ bib ]
[BL98]	P. L. Bartlett and P. M. Long. Prediction, learning, uniform convergence, and scale-sensitive dimensions. Journal of Computer and System Sciences, 56(2):174--190, 1998. (special issue on COLT`95). [ bib ]
[KBB98]	L. C. Kammer, R. R. Bitmead, and P. L. Bartlett. Optimal controller properties from closed-loop experiments. Automatica, 34(1):83--91, 1998. [ bib ]
[BV98]	P. L. Bartlett and M. Vidyasagar. Introduction to the special issue on learning theory. Systems and Control Letters, 34:113--114, 1998. [ bib ]
[BK98]	P. L. Bartlett and S. Kulkarni. The complexity of model classes, and smoothing noisy data. Systems and Control Letters, 34(3):133--140, 1998. [ bib ]
[BLL98]	P. L. Bartlett, T. Linder, and G. Lugosi. The minimax distortion redundancy in empirical quantizer design. IEEE Transactions on Information Theory, 44(5):1802--1813, 1998. [ bib ]
[STBWA98]	J. Shawe-Taylor, P. L. Bartlett, R. C. Williamson, and M. Anthony. Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory, 44(5):1926--1940, 1998. [ bib ]
[LBW98]	W. S. Lee, P. L. Bartlett, and R. C. Williamson. The importance of convexity in learning with squared loss. IEEE Transactions on Information Theory, 44(5):1974--1980, 1998. [ bib ]
[BMM98]	P. L. Bartlett, V. Maiorov, and R. Meir. Almost linear VC dimension bounds for piecewise polynomial networks. Neural Computation, 10(8):2159--2173, 1998. [ bib ]
[SFBL98]	R. E. Schapire, Y. Freund, P. L. Bartlett, and W. S. Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. Annals of Statistics, 26(5):1651--1686, 1998. [ bib ]
[LBW97]	W. S. Lee, P. L. Bartlett, and R. C. Williamson. Correction to `lower bounds on the VC-dimension of smoothly parametrized function classes'. Neural Computation, 9:765--769, 1997. [ bib ]
[Bar97]	P. L. Bartlett. Book review: `Neural networks for pattern recognition,' Christopher M. Bishop. Statistics in Medicine, 16(20):2385--2386, 1997. [ bib ]
[BKP97]	P. L. Bartlett, S. R. Kulkarni, and S. E. Posner. Covering numbers for real-valued function classes. IEEE Transactions on Information Theory, 43(5):1721--1724, 1997. [ bib ]
[BW96]	P. L. Bartlett and R. C. Williamson. The Vapnik-Chervonenkis dimension and pseudodimension of two-layer neural networks with discrete inputs. Neural Computation, 8:653--656, 1996. [ bib ]
[BLW96]	P. L. Bartlett, P. M. Long, and R. C. Williamson. Fat-shattering and the learnability of real-valued functions. Journal of Computer and System Sciences, 52(3):434--452, 1996. (special issue on COLT`94). [ bib ]
[ABIST96]	M. Anthony, P. L. Bartlett, Y. Ishai, and J. Shawe-Taylor. Valid generalisation from approximate interpolation. Combinatorics, Probability, and Computing, 5:191--214, 1996. [ bib ]
[LBW96]	W. S. Lee, P. L. Bartlett, and R. C. Williamson. Efficient agnostic learning of neural networks with bounded fan-in. IEEE Transactions on Information Theory, 42(6):2118--2132, 1996. [ bib ]
[LBW95]	W. S. Lee, P. L. Bartlett, and R. C. Williamson. Lower bounds on the VC-dimension of smoothly parametrized function classes. Neural Computation, 7:990--1002, 1995. (See also correction, Neural Computation, 9: 765--769, 1997). [ bib ]
[Bar94]	P. L. Bartlett. Computational learning theory. In A. Kent and J. G. Williams, editors, Encyclopedia of Computer Science and Technology, volume 31, pages 83--99. Marcel Dekker, 1994. [ bib ]
[Bar93]	P. L. Bartlett. Vapnik-Chervonenkis dimension bounds for two- and three-layer networks. Neural Computation, 5(3):371--373, 1993. [ bib ]
[BD92]	P. L. Bartlett and T. Downs. Using random weights to train multi-layer networks of hard-limiting units. IEEE Transactions on Neural Networks, 3(2):202--210, 1992. [ bib ]
[LBD92]	D. R. Lovell, P. L. Bartlett, and T. Downs. Error and variance bounds on sigmoidal neurons with weight and input errors. Electronics Letters, 28(8):760--762, 1992. [ bib ]

This file was generated by bibtex2html 1.99.