We regard histogram density estimation as a model selection problem. Our approach is based on the information-theoretic minimum description length (MDL) principle, which can be applied to tasks such as data clustering, density estimation, image denoising and model selection in general. MDL-based model selection is formalized via the normalized maximum likelihood (NML) distribution, which has several desirable optimality properties. We show how this framework can be applied to learning generic, irregular (variable-width bin) histograms, and how to compute the NML model selection criterion efficiently. We also derive a dynamic programming algorithm for finding both the MDL-optimal bin count and the cut point locations in polynomial time. Fin...
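The dynamic programming idea mentioned in the abstract can be illustrated with a small sketch. It is a simplified illustration and not necessarily the exact criterion derived in the paper: it assumes a fixed, sorted list of candidate cut points, scores a K-bin histogram by its negative log-likelihood plus the logarithm of the multinomial NML normalizing sum (computed with the known recurrence C(K+2, n) = C(K+1, n) + (n/K) C(K, n)), and adds a log-binomial term for encoding the chosen cut points, which is an assumption of this sketch. The function names `multinomial_complexities` and `mdl_histogram` are hypothetical.

```python
import math
from bisect import bisect_left


def multinomial_complexities(n, max_k):
    """Return [C(1, n), ..., C(max_k, n)], the multinomial NML normalizing sums."""
    c = [1.0]  # C(1, n) = 1
    if max_k >= 2:
        # C(2, n) = sum_h binom(n, h) (h/n)^h ((n-h)/n)^(n-h), evaluated in log space
        total = 0.0
        for h in range(n + 1):
            t = math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
            if h > 0:
                t += h * math.log(h / n)
            if h < n:
                t += (n - h) * math.log((n - h) / n)
            total += math.exp(t)
        c.append(total)
    for k in range(1, max_k - 1):
        # Recurrence: C(k+2, n) = C(k+1, n) + (n / k) * C(k, n)
        c.append(c[-1] + (n / k) * c[-2])
    return c


def mdl_histogram(data, candidates, max_bins):
    """Pick a bin count and cut points among `candidates` (sketch criterion).

    Assumes candidates is sorted, candidates[0] <= min(data) and
    candidates[-1] > max(data). Bins are half-open: [left, right).
    """
    xs = sorted(data)
    n = len(xs)
    E = len(candidates) - 1
    below = [bisect_left(xs, c) for c in candidates]  # points strictly below each candidate
    assert below[0] == 0 and below[E] == n, "candidates must cover the data"
    max_bins = min(max_bins, E)

    def bin_cost(i, j):
        # Negative log-likelihood contribution of the bin [candidates[i], candidates[j])
        h = below[j] - below[i]
        if h == 0:
            return 0.0
        return -h * math.log(h / (n * (candidates[j] - candidates[i])))

    inf = float("inf")
    # best[k][e]: minimal cost of covering [candidates[0], candidates[e]) with k bins
    best = [[inf] * (E + 1) for _ in range(max_bins + 1)]
    back = [[-1] * (E + 1) for _ in range(max_bins + 1)]
    for e in range(1, E + 1):
        best[1][e], back[1][e] = bin_cost(0, e), 0
    for k in range(2, max_bins + 1):
        for e in range(k, E + 1):
            for prev in range(k - 1, e):
                cost = best[k - 1][prev] + bin_cost(prev, e)
                if cost < best[k][e]:
                    best[k][e], back[k][e] = cost, prev

    comp = multinomial_complexities(n, max_bins)
    scores = {
        k: best[k][E] + math.log(comp[k - 1]) + math.log(math.comb(E - 1, k - 1))
        for k in range(1, max_bins + 1)
    }
    best_k = min(scores, key=scores.get)

    # Follow the back-pointers to recover the chosen cut points
    cuts, e = [candidates[E]], E
    for k in range(best_k, 0, -1):
        e = back[k][e]
        cuts.append(candidates[e])
    cuts.reverse()
    return best_k, cuts


if __name__ == "__main__":
    sample = [0.001 * i for i in range(50)] + [0.9 + 0.001 * i for i in range(50)]
    grid = [i / 10 for i in range(11)]
    print(mdl_histogram(sample, grid, max_bins=5))
```

The triple loop makes the sketch cubic in the number of candidate cut points, which is polynomial but not tuned for speed; the candidate grid itself is a free choice here.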
Let p be an unknown and arbitrary probability distribution over [0, 1). We consider the problem of ...
We present a data-adaptive multivariate histogram estimator of an unknown density f based on n indep...
The normalized maximum likelihood (NML) distribution has an important position in minimum descriptio...
G-Enum histograms are a new fast and fully automated method for irregular hist...
The Minimum Description Length (MDL) principle is an information theoretic app...
The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization ...
When considering a data set it is often unknown how complex it is, and hence it is difficult to asse...
Histograms are convenient non-parametric density estimators, which continue to be used ubiquitously....
We consider the problem of model selection using the Minimum Description Length (MDL) criterion for ...
We propose a fully automatic procedure for the construction of irregular histograms. For a given num...
Unsupervised discretization is a crucial step in many knowledge discovery tasks. The state-of-the-ar...
A natural way to estimate the probability density function of an unknown distribution from the sampl...
A multivariate modified histogram density estimate depending on a reference de...
The minimum description length (MDL) principle originated from data compression literature and has b...