We present tree- and list- structured density estimation methods for high dimensional binary/categorical data. Our density estimation models are high dimensional analogies to variable bin width histograms. In each leaf of the tree (or list), the density is constant, similar to the flat density within the bin of a histogram. Histograms, however, cannot easily be visualized in higher dimensions, whereas our models can. The accuracy of histograms fades as dimensions increase, whereas our models have priors that help with generalization. Our models are sparse, unlike high-dimensional histograms. We present three generative models, where the first one allows the user to specify the number of desired leaves in the tree within a Bayesian prior. Th...
Multivariate density estimation is a fundamental problem in Applied Statistics and Machine Learning....
Deep generative models have de facto emerged as state of the art when it comes to density estimation...
<p>Although Bayesian density estimation using discrete mixtures has good performance in modest dimen...
Data Mining can be seen as an extension to statistics. It comprises the preparation of data and the ...
Density estimation is a fundamental problem in statistics, and any attempt to do so in high dimensio...
Histograms are convenient non-parametric density estimators, which continue to be used ubiquitously....
Unsupervised discretization is a crucial step in many knowledge discovery tasks. The state-of-the-ar...
G-Enum histograms are a new fast and fully automated method for irregular histogram construction. By...
We present general sufficient conditions for the almost sure $L_1$-consistency of histogram density ...
We contribute to the study of data binning in density estimation. The particular disadvantage of his...
We demonstrate the benefits of probabilistic representations due to their expressiveness which allow...
Choosing the bin sizes for a histogram can be surprisingly tricky. If there are too few bins, it is ...
Multivariate histograms are difficult to construct due to the curse of dimensionality. Motivated by ...
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Resear...
Abstract(#br)We present a data-adaptive multivariate histogram estimator of an unknown density f bas...
Multivariate density estimation is a fundamental problem in Applied Statistics and Machine Learning....
Deep generative models have de facto emerged as state of the art when it comes to density estimation...
<p>Although Bayesian density estimation using discrete mixtures has good performance in modest dimen...
Data Mining can be seen as an extension to statistics. It comprises the preparation of data and the ...
Density estimation is a fundamental problem in statistics, and any attempt to do so in high dimensio...
Histograms are convenient non-parametric density estimators, which continue to be used ubiquitously....
Unsupervised discretization is a crucial step in many knowledge discovery tasks. The state-of-the-ar...
G-Enum histograms are a new fast and fully automated method for irregular histogram construction. By...
We present general sufficient conditions for the almost sure $L_1$-consistency of histogram density ...
We contribute to the study of data binning in density estimation. The particular disadvantage of his...
We demonstrate the benefits of probabilistic representations due to their expressiveness which allow...
Choosing the bin sizes for a histogram can be surprisingly tricky. If there are too few bins, it is ...
Multivariate histograms are difficult to construct due to the curse of dimensionality. Motivated by ...
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Resear...
Abstract(#br)We present a data-adaptive multivariate histogram estimator of an unknown density f bas...
Multivariate density estimation is a fundamental problem in Applied Statistics and Machine Learning....
Deep generative models have de facto emerged as state of the art when it comes to density estimation...
<p>Although Bayesian density estimation using discrete mixtures has good performance in modest dimen...