The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is finitely bounded, implying convergence with probability one, and (b) it additionally specifies the convergence speed. For MDL, in general one can only have loss bounds which are finite but exponentially larger than those for Bayes mixtures. We show that this is even the case if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bay...
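The contrast drawn in this abstract, between the MDL predictor (which commits to the single hypothesis minimizing the two-part code length) and the Bayes mixture (which averages over all hypotheses by posterior weight), can be sketched for a countable Bernoulli class. All parameter values, prior weights, and the true bias below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hypothetical countable (here: finite) Bernoulli class with prior weights w.
thetas = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
weights = np.array([0.40, 0.25, 0.15, 0.12, 0.08])  # prior, sums to 1

rng = np.random.default_rng(0)
true_theta = 0.7
x = rng.random(200) < true_theta  # observed binary sequence

log_w = np.log(weights)
log_lik = np.zeros_like(thetas)  # log-likelihood of past data per hypothesis

mdl_sq_loss = 0.0
bayes_sq_loss = 0.0
for bit in x:
    scores = log_w + log_lik  # negated two-part code length, up to sign
    # MDL: predict with the single code-length-minimizing hypothesis.
    p_mdl = thetas[np.argmax(scores)]
    # Bayes mixture: posterior-weighted average of the predictions.
    post = np.exp(scores - scores.max())
    post /= post.sum()
    p_bayes = post @ thetas
    # Instantaneous square loss against the true parameter.
    mdl_sq_loss += (p_mdl - true_theta) ** 2
    bayes_sq_loss += (p_bayes - true_theta) ** 2
    # Update each hypothesis's log-likelihood with the new observation.
    log_lik += np.log(np.where(bit, thetas, 1 - thetas))
```

Both cumulative losses stay finite here, consistent with point (a) of the abstract; the paper's result concerns how much larger the MDL total can be than the Bayes-mixture total in the worst case.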
In this paper, we consider the problem of on-line prediction in which at each time an unlabe...
The probability of observing x_t at time t, given past observations x_1 ⋯ x_{t-1}, can be computed if the ...
We study the fundamental problem of learning an unknown, smooth probability function via pointwise B...
We consider the Minimum Description Length principle for online sequence prediction. If the underlyi...
Minimum description length (MDL) is an important principle for induction and prediction, with strong...
We study the properties of the Minimum Description Length principle for sequence prediction, conside...
Minimum Description Length (MDL) is an important principle for induction and prediction, with stron...
The Minimum Description Length (MDL) principle selects the model that has the shortest code for data...
We study the properties of the MDL (or maximum penalized complexity) estimator fo...
This paper studies sequence prediction based on the monotone Kolmogorov complexity Km = -log m, i.e....
Various optimality properties of universal sequence predictors based on Bayes-mixtures in general, a...
We study online learning under logarithmic loss with regular parametric models. In this setting, eac...