We propose a learning setting in which unlabeled data is free, and the cost of a label depends on its value, which is not known in advance. We study binary classification in an extreme case, where the algorithm only pays for negative labels. We are motivated by applications such as fraud detection, in which investigating an honest transaction should be avoided if possible. We term this setting auditing, and consider the auditing complexity of an algorithm: the number of negative labels the algorithm requires in order to learn a hypothesis with low relative error. We design auditing algorithms for simple hypothesis classes (thresholds and rectangles), and show that with these algorithms, the auditing complexity can be significantly lower than ...
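As a rough illustration of the cost model for thresholds, the sketch below counts only negative labels as cost. It is not the algorithm from the paper; it assumes the labels are exactly realizable by a threshold (label 1 if and only if x >= t), and the function names audit_threshold and binary_search_threshold are hypothetical. Under that assumption, scanning from the largest point downward pays for at most one negative label, while a binary-search active learner may pay for several.

```python
from typing import Callable, Sequence, Tuple


def audit_threshold(points: Sequence[float],
                    label: Callable[[float], int]) -> Tuple[float, int]:
    """Scan from the largest point down and stop at the first negative label.

    Assumption (not from the source): labels are realizable by a threshold,
    i.e. label(x) == 1 iff x >= t for some unknown t.
    Returns (smallest point seen with a positive label, negatives paid for).
    """
    negatives_paid = 0
    threshold = float("inf")  # no positive point seen yet
    for x in sorted(points, reverse=True):
        if label(x) == 1:
            threshold = x  # positive labels are free in this cost model
        else:
            negatives_paid += 1
            break  # every smaller point is negative under the assumption
    return threshold, negatives_paid


def binary_search_threshold(points: Sequence[float],
                            label: Callable[[float], int]) -> Tuple[float, int]:
    """Baseline: binary search over the sorted points, counting negatives."""
    xs = sorted(points)
    lo, hi = 0, len(xs)  # the index of the first positive point lies in [lo, hi]
    negatives_paid = 0
    while lo < hi:
        mid = (lo + hi) // 2
        if label(xs[mid]) == 1:
            hi = mid
        else:
            negatives_paid += 1
            lo = mid + 1
    threshold = xs[lo] if lo < len(xs) else float("inf")
    return threshold, negatives_paid
```

Both functions recover the same threshold on realizable data. The downward scan makes more label queries in total, but in this setting only the negative ones are charged.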
We describe a framework for designing efficient active learning algorithms that are tolerant to rand...
In this position paper we introduce Active Inference, a paradigm for intelligently requesting huma...
In many settings in practice it is expensive to obtain labeled data while unlabeled data is abundant...
We study active learning where the labeler can not only return incorrect labels but also abstain fro...
This dissertation develops and analyzes active learning algorithms for binary classification problem...
Machine learning algorithms detect patterns, regularities, and rules from the training data and adj...
This thesis studies active learning and confidence-rated prediction, and the interplay between these...
The original and most widely studied PAC model for learning assumes a passive learner in the sense t...
There has been growing recent interest in the field of active learning for binary classification. Th...
In this paper we propose and study a generalization of the standard active-learning model where a mo...
Recent decades have witnessed great success of machine learning, especially for tasks where large an...
Thesis (Ph.D.)--Boston University. In a typical discriminative learning setting, a set of labeled trai...
We investigate learnability in the PAC model when the data used for learning, attributes and labels,...