We study the interaction between input distributions, learning algorithms, and finite sample sizes in the case of learning classification tasks. Focusing on the case of normal input distributions, we use statistical mechanics techniques to calculate the empirical and expected (or generalization) errors for several well-known algorithms learning the weights of a single-layer perceptron. In the case of spherically symmetric distributions within each class we find that the simple Hebb rule, corresponding to maximum-likelihood parameter estimation, outperforms the other, more complex algorithms based on error minimization. Moreover, we show that in the regime where the overlap between the classes is large, algorithms with low empirical error...
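The setting above lends itself to a small numerical sketch. The Python snippet below is illustrative only: the paper's results come from statistical-mechanics calculations, not simulation. It draws two strongly overlapping, spherically symmetric Gaussian classes, trains a single-layer perceptron with the Hebb rule (which in this Gaussian setting is the maximum-likelihood estimate of the class-mean direction), and compares it against a pocket perceptron used as a stand-in for an error-minimizing algorithm. All dimensions, sample sizes, and names are assumptions chosen for illustration.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

N, p = 50, 100                 # input dimension, number of training examples
m = np.full(N, 0.3 / sqrt(N))  # class-mean direction; small norm -> large class overlap

# Training set: labels y = +/-1 with equal probability, inputs x ~ Normal(y*m, I)
y = rng.choice([-1.0, 1.0], size=p)
x = y[:, None] * m + rng.standard_normal((p, N))

def errors(w):
    """Empirical error on the training set, and the exact generalization error:
    for Gaussian classes with means +/-m and identity covariance, the expected
    error of the classifier sign(w.x) is Phi(-w.m / |w|)."""
    emp = np.mean(np.sign(x @ w) != y)
    z = (w @ m) / np.linalg.norm(w)
    gen = 0.5 * (1.0 + erf(-z / sqrt(2.0)))
    return emp, gen

# Hebb rule: w = (1/p) * sum_mu y_mu x_mu, the sample mean difference between
# the classes, i.e. the maximum-likelihood estimate of the class-mean direction.
w_hebb = (y[:, None] * x).mean(axis=0)

# Stand-in for an error-minimizing algorithm (an assumption, not the paper's
# exact procedure): a pocket perceptron that greedily reduces training mistakes.
w = w_hebb.copy()
w_pocket, best_emp = w_hebb.copy(), errors(w_hebb)[0]
for _ in range(500):
    wrong = np.flatnonzero(np.sign(x @ w) != y)
    if wrong.size == 0:
        break
    w = w + y[wrong[0]] * x[wrong[0]]   # perceptron update on a misclassified example
    emp = errors(w)[0]
    if emp < best_emp:                  # keep the best weights seen so far ("pocket")
        best_emp, w_pocket = emp, w.copy()

for name, wv in [("Hebb/ML", w_hebb), ("error-min", w_pocket)]:
    emp, gen = errors(wv)
    print(f"{name:>9}: empirical error {emp:.3f}, generalization error {gen:.3f}")
```

With strongly overlapping classes (small norm of m), the error-minimizing weights typically reach a lower empirical error than the Hebb vector yet a higher generalization error, which mirrors the qualitative effect described in the abstract.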