Abstract. In exploratory data mining it is important to assess the significance of results. Given that analysts have only limited time, it is important that we can measure this with regard to what we already know. That is, we want to be able to measure whether a result is interesting from a subjective point of view. With this as our goal, we formalise how to probabilistically model real-valued data by the Maximum Entropy principle, where we allow statistics on arbitrary sets of cells as background knowledge. As statistics, we consider means and variances, as well as histograms. The resulting models allow us to assess the likelihood of values, and can be used to verify the significance of (possibly overlapping) structures discovered in th...
A desirable feature of a database system is its ability to reason with probabilistic information. Th...
In this study we illustrate a Maximum Entropy (ME) methodology for modeling incomplete information a...
In this paper we describe and evaluate different statistical models for the task of realization rank...
Recent research has highlighted the practical benefits of subjective interestingness measures, which...
Abstract. Statistical assessment of the results of data mining is increasingly recognised as a core t...
The combination of mathematical models and uncertainty measures can be applied in the area of data m...
The central theme of this dissertation is the statistical analysis of retrieval data. Features commo...
We consider the problem of defining the significance of an itemset. We say that the itemset is signi...
Probabilistic graphical models are a very efficient machine learning technique. However, their only ...
We propose a framework for learning hidden-variable models by optimizing entropies, in which entropy...
The Principle of Maximum Entropy is discussed and two classic probabilistic models of information re...
In this paper we consider different entropy-based approaches to Pattern Mining. We ...
Deriving insights from high-dimensional data is one of the core problems in data mining. The difficu...
Our goal is to enhance multidimensional database systems with a suite of advanced operators to autom...
Entropy gain is widely used for learning decision trees. However, as we go dee...