Machine learning often relies on costly labeled data, and this impedes its application to new classification and information extraction problems. This has motivated the development of methods for leveraging abundant prior knowledge about these problems, including methods for lightly supervised learning using model expectation constraints. Building on this work, we envision an interactive training paradigm in which practitioners perform evaluation, analyze errors, and provide and refine expectation constraints in a closed loop. In this paper, we focus on several key subproblems in this paradigm that can be cast as selecting a representative sample of the unlabeled data for the practitioner to inspect. To address these problems, we propose...
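The snippet above is cut off before it names the selection method, so the following is only an illustrative sketch of one common way to pick a representative sample of an unlabeled pool: cluster the feature vectors and surface the point nearest each cluster centre for the practitioner to inspect. The use of scikit-learn's KMeans and the function name representative_sample are assumptions made here for illustration, not the paper's algorithm.

```python
# Hypothetical sketch: pick a representative sample of unlabeled data for inspection
# by clustering feature vectors and taking the point nearest each cluster centre.
# Using k-means (via scikit-learn) is an assumption, not the cited paper's method.
import numpy as np
from sklearn.cluster import KMeans

def representative_sample(X_unlabeled: np.ndarray, k: int = 20, seed: int = 0) -> np.ndarray:
    """Return indices of k points that roughly cover the unlabeled pool."""
    km = KMeans(n_clusters=k, random_state=seed, n_init=10).fit(X_unlabeled)
    picked = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(X_unlabeled[members] - km.cluster_centers_[c], axis=1)
        picked.append(members[np.argmin(dists)])  # medoid-like pick per cluster
    return np.array(picked)
```

In the interactive loop the abstract envisions, indices returned by a routine like this would be the examples shown to the practitioner for evaluation and error analysis before expectation constraints are written or refined.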
It is difficult to apply machine learning to many real-world tasks because there are no existing lab...
Most positive and unlabeled data is subject to selection biases. The labeled examples can, for examp...
Standard machine learning approaches require labeled data, and labeling data for each task, language...
In a real-world application of supervised learning, we have a training set of examples with labels, ...
A constant challenge to researchers is the lack of large and timely datasets of domain examples (res...
In the traditional, passive approach to learning, the information available to the learner is a set...
Uncertainty sampling methods iteratively request class labels for training instances whose classes a...
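As a minimal sketch of the uncertainty-sampling loop this snippet refers to, the code below assumes a scikit-learn-style classifier exposing predict_proba and a hypothetical oracle callable that returns true labels on request; the least-confidence criterion and all names are illustrative choices, not details taken from the cited work.

```python
# Minimal uncertainty-sampling sketch (illustrative; assumes a scikit-learn-style
# classifier with predict_proba and a hypothetical `oracle` that supplies labels).
import numpy as np

def least_confident_indices(model, X_pool: np.ndarray, batch_size: int = 10) -> np.ndarray:
    """Indices of pool points whose predicted class is least confident."""
    proba = model.predict_proba(X_pool)
    confidence = proba.max(axis=1)               # confidence in the top class
    return np.argsort(confidence)[:batch_size]   # lowest confidence first

def uncertainty_sampling_loop(model, X_train, y_train, X_pool, oracle,
                              rounds: int = 5, batch_size: int = 10):
    """Iteratively query labels for the most uncertain pool instances."""
    for _ in range(rounds):
        model.fit(X_train, y_train)
        query = least_confident_indices(model, X_pool, batch_size)
        y_new = oracle(X_pool[query])            # request labels for uncertain instances
        X_train = np.vstack([X_train, X_pool[query]])
        y_train = np.concatenate([y_train, y_new])
        X_pool = np.delete(X_pool, query, axis=0)
    return model.fit(X_train, y_train)
```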
It is a pressing and challenging problem to learn cost-sensitive models from datasets that are wi...
Given k pre-trained classifiers and a stream of unlabeled data examples, how can we actively decide ...
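For the stream setting sketched in the previous snippet, one hedged illustration is a query-by-committee rule: request a label only when the k pre-trained classifiers disagree strongly on an incoming example. The vote-entropy measure, threshold, and function names below are assumptions for illustration, not the decision rule of the cited paper.

```python
# Illustrative stream-based selection with k pre-trained classifiers: query a label
# only when the committee's hard votes disagree (vote entropy above a threshold).
# Generic query-by-committee sketch; not the specific method of the cited paper.
import numpy as np
from collections import Counter

def vote_entropy(predictions: list) -> float:
    """Disagreement of a committee's hard votes on one example."""
    counts = np.array(list(Counter(predictions).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def stream_queries(classifiers, stream, threshold: float = 0.5):
    """Yield stream examples whose committee disagreement exceeds `threshold`."""
    for x in stream:
        votes = [clf.predict([x])[0] for clf in classifiers]
        if vote_entropy(votes) > threshold:
            yield x  # worth spending a label on
```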
It is difficult to apply machine learning to new domains because we often lack labeled problem insta...
Thesis (Ph.D.)--Boston University. In a typical discriminative learning setting, a set of labeled trai...
This paper addresses the problem of learning when high-quality labeled examples are an expensive re...