An important subproblem in supervised tasks such as decision tree induction and subgroup discovery is finding an interesting binary feature (such as a node split or a subgroup refinement) based on a numeric or nominal attribute, with respect to some discrete or continuous target variable. Often one is faced with a trade-off between the expressiveness of such features on the one hand and the ability to efficiently traverse the feature search space on the other hand. In this article, we present efficient algorithms to mine binary features that optimize a given convex quality measure. For numeric attributes, we propose an algorithm that finds an optimal interval, whereas for nominal attributes, we give an algorithm that finds an optimal value ...
To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is of...
Research Doctorate - Doctor of Philosophy (PhD)Intuitively, the Feature Selection problem is to choo...
There has been a growing interest in representing real-life applications with data sets having binar...
An important subproblem in supervised tasks such as decision tree induction and subgroup discovery i...
Subgroup discovery systems are concerned with finding interesting patterns in labeled data. How thes...
Classification and supervised learning problems in general aim to choose a function that best descri...
We consider data sets that consist of n-dimensional binary vectors representing positive and negativ...
Abstract—Feature subset selection, as a special case of the general subset selection problem, has be...
In this paper we consider Box Clustering, a method for supervised classification that partitions the...
Automated feature discovery is a fundamental problem in machine learning. Although classical feature...
Suppose that an m-simplex is partitioned into n convex regions having disjoint interiors and distinc...
One natural, yet unusual, source of data is the set of queries that are performed on a database. We ...
Pattern set mining has been successful in discovering small sets of highly informative and useful pa...
Identifying a small number of features that can represent the data is a known problem that comes up ...
International audienceSubgroup discovery in labeled data is the task of discovering patterns in the ...
To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is of...
Research Doctorate - Doctor of Philosophy (PhD)Intuitively, the Feature Selection problem is to choo...
There has been a growing interest in representing real-life applications with data sets having binar...
An important subproblem in supervised tasks such as decision tree induction and subgroup discovery i...
Subgroup discovery systems are concerned with finding interesting patterns in labeled data. How thes...
Classification and supervised learning problems in general aim to choose a function that best descri...
We consider data sets that consist of n-dimensional binary vectors representing positive and negativ...
Abstract—Feature subset selection, as a special case of the general subset selection problem, has be...
In this paper we consider Box Clustering, a method for supervised classification that partitions the...
Automated feature discovery is a fundamental problem in machine learning. Although classical feature...
Suppose that an m-simplex is partitioned into n convex regions having disjoint interiors and distinc...
One natural, yet unusual, source of data is the set of queries that are performed on a database. We ...
Pattern set mining has been successful in discovering small sets of highly informative and useful pa...
Identifying a small number of features that can represent the data is a known problem that comes up ...
International audienceSubgroup discovery in labeled data is the task of discovering patterns in the ...
To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is of...
Research Doctorate - Doctor of Philosophy (PhD)Intuitively, the Feature Selection problem is to choo...
There has been a growing interest in representing real-life applications with data sets having binar...