Abstract When first faced with a learning task, it is often unclear what a satisfactory representation of the training data should be, so we are forced to create a set of features that appear plausible, without any strong confidence that they will yield superior learning. Moreover, we often have no prior knowledge of which learning method is best to apply, and thus try multiple methods in an attempt to find the one that performs best. This paper describes a method called Feature-mine that takes a set of features and augments them with macro-features that test for the occurrence of combinations of values of the original features. Our approach uses associations that are mined from the data to create these new feature...
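The idea of augmenting a feature set with macro-features can be illustrated with a minimal sketch. This is not the Feature-mine algorithm itself, whose details are cut off above; it only assumes that a macro-feature is a binary test for a frequently co-occurring pair of feature values. The function names `mine_frequent_pairs` and `add_macro_features`, the `min_support` parameter, and the toy weather data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_pairs(rows, min_support):
    """Return pairs of (feature, value) items that co-occur in at least
    a min_support fraction of the rows (a simple stand-in for association mining)."""
    counts = Counter()
    for row in rows:
        items = sorted(row.items())  # canonical order so each pair is counted once
        for combo in combinations(items, 2):
            counts[combo] += 1
    threshold = min_support * len(rows)
    return [combo for combo, n in counts.items() if n >= threshold]


def add_macro_features(rows, combos):
    """Copy each row and add one 0/1 macro-feature per mined combination."""
    augmented = []
    for row in rows:
        new_row = dict(row)
        for combo in combos:
            name = " & ".join(f"{k}={v}" for k, v in combo)
            # The macro-feature fires only if every value in the combination matches.
            new_row[name] = int(all(row.get(k) == v for k, v in combo))
        augmented.append(new_row)
    return augmented


# Hypothetical toy data: each row maps feature names to values.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "yes"},
]

combos = mine_frequent_pairs(rows, min_support=0.5)
augmented = add_macro_features(rows, combos)
```

Here only the combination `outlook=sunny & windy=no` reaches the assumed 50% support threshold, so each augmented row gains a single extra binary column. Any downstream learner can then be trained on `augmented` instead of `rows`.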
In many datasets, there is a very large number of attributes (e.g. many thousands). Such datasets ca...
A key objective of data mining is to uncover the hidden relationships among the objects in given dat...
With advanced computer technologies and their omnipresent usage, data accumulates at a speed unmatch...
Abstract Image mining requires the extraction of features from image data. Hundreds of features can...
Classification is a widely used technique in the data mining domain, where scalability and efficienc...
Learning Classifier Systems (LCS) are a well-known machine learning method, producing sets of interp...
During the past few decades, researchers have worked on data preprocessing techniques for datasets. Data ...
This is an electronic version of the paper presented at the III Taller de Minería de Datos y Aprendi...
Data mining is the process of analyzing data from different perspectives and summarizing it into use...
A central problem in machine learning is identifying a representative set of features from which to ...
1 Introduction The process of feature selection, also known as attribute subset selection, is a key f...
This thesis introduces two novel machine learning methods of feature ranking and feature selection....
Datasets found in real world applications of machine learning are often characterized by low-level a...
Data mining is a process of extracting knowledge from underlying huge multidimensional data. Data mi...
Feature Subset Selection (FSS) selects a subset of features from the feature space, taking into ...