Learning Theory and Generalization for Neural Networks and Other
Supervised Learning Techniques
Abstract
This tutorial will provide an introduction to the theory of the
generalization performance of supervised learning techniques.
It will explain several key models and describe the main results
relating the generalization performance of a learning system to its
complexity. The discussion will concentrate on pattern classification
and real-valued prediction problems, using neural networks as examples, but
these results are of considerable importance in understanding a much
broader variety of phenomena in machine learning.
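To make the flavour of these results concrete, one representative
uniform-convergence bound (a standard form, given here in LaTeX
notation for illustration and not quoted from the tutorial itself) is
the following: for a class H of {0,1}-valued classifiers with VC
dimension d, with probability at least 1 - \delta over m i.i.d.
training examples, every h in H satisfies, for some universal
constant c,
\[
  \mathrm{err}(h) \;\le\; \widehat{\mathrm{err}}(h)
    + c \sqrt{\frac{d \ln(m/d) + \ln(1/\delta)}{m}} ,
\]
so the gap between true and training error shrinks as the sample size
m grows relative to the complexity d.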
The latter part of the tutorial will concentrate on recent advances
that exploit these results to provide new analyses of large margin
classifiers. Many pattern classifiers, such as neural networks and
support vector machines, and techniques for combining classifiers,
such as boosting and bagging, predict class labels by thresholding
real-valued functions, and tend to leave a large margin between the
real-valued output and the threshold at which the prediction would
become incorrect. This part of the tutorial will present results on
the generalization performance of such large margin classifiers, and
explain why their size (for instance, the number of parameters in a
network) is not the most appropriate measure of their complexity.
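As a brief illustration of the central quantity (a standard
definition, added here for clarity), suppose a classifier predicts
the label \mathrm{sign}(f(x)) using a real-valued function f. For a
labelled example (x, y) with y \in \{-1, +1\}, the margin is
\[
  \mathrm{margin}(x, y) \;=\; y\, f(x) ,
\]
which is positive exactly when the classification is correct, and
large when the real-valued output is far from the decision threshold
at zero. Roughly speaking, the margin-based analyses described above
bound the misclassification probability by the fraction of training
examples with margin below some scale \gamma, plus a complexity
penalty that depends on that scale rather than on the number of
parameters.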
Last update: Wed Nov 11 11:26:41 EST 1998