In many classification tasks training data have missing feature values that can be acquired at a cost. For building accurate predictive models, acquiring all missing values is often prohibitively expensive or unnecessary, while acquiring a random subset of feature values may not be most effective. The goal of active feature-value acquisition is to incrementally select feature values that are most cost-effective for improving the model’s accuracy. We present two policies, Sampled Expected Utility and Expected Utility-ES, that acquire feature values for inducing a classification model based on an estimation of the expected improvement in model accuracy per unit cost. A comparison of the two policies to each other and to alternative policies d...
It can be expensive to acquire the data required for businesses to employ data-driven predictive mod...
In many knowledge discovery applications the data mining step is followed by further data acquisitio...
Some data analysis applications comprise datasets, where explanatory variables are expensive or tedi...
In many classification tasks training data have missing feature values that can be acquired at a cos...
In many classification tasks training data have missing feature values that can be acquired at a cos...
Many induction problems include missing data that can be acquired at a cost. For building accurate p...
Most induction algorithms for building predictive models take as input training data in the form of ...
Most induction algorithms for building predictive models take as input training data in the form of ...
Many induction problems, such as on-line customer profiling, include missing data that can be acquir...
Traditional active learning tries to identify instances for which the acquisition of the label incre...
Real-world data is noisy and can often suffer from corruptions or incomplete values that may impact ...
This chapter deals with active feature value acquisition for feature relevance estimation in domains...
The general approach for automatically driving data collection using information from previously acq...
Classification is a well-studied problem in machine learning and data mining. Classifier performance...
Knowledge discovery is traditionally performed under a tacit closed-world assumption, in that, induc...
It can be expensive to acquire the data required for businesses to employ data-driven predictive mod...
In many knowledge discovery applications the data mining step is followed by further data acquisitio...
Some data analysis applications comprise datasets, where explanatory variables are expensive or tedi...
In many classification tasks training data have missing feature values that can be acquired at a cos...
In many classification tasks training data have missing feature values that can be acquired at a cos...
Many induction problems include missing data that can be acquired at a cost. For building accurate p...
Most induction algorithms for building predictive models take as input training data in the form of ...
Most induction algorithms for building predictive models take as input training data in the form of ...
Many induction problems, such as on-line customer profiling, include missing data that can be acquir...
Traditional active learning tries to identify instances for which the acquisition of the label incre...
Real-world data is noisy and can often suffer from corruptions or incomplete values that may impact ...
This chapter deals with active feature value acquisition for feature relevance estimation in domains...
The general approach for automatically driving data collection using information from previously acq...
Classification is a well-studied problem in machine learning and data mining. Classifier performance...
Knowledge discovery is traditionally performed under a tacit closed-world assumption, in that, induc...
It can be expensive to acquire the data required for businesses to employ data-driven predictive mod...
In many knowledge discovery applications the data mining step is followed by further data acquisitio...
Some data analysis applications comprise datasets, where explanatory variables are expensive or tedi...