In modern statistical applications, we are often faced with situations where there is either too little or too much data. Both extremes can be troublesome: interesting models can only be learnt when sufficient amounts of data are available, yet these models tend to become intractable when data is abundant. An important thread of research addresses these difficulties by subsampling the data prior to learning a model. Subsampling can be active (i.e. active learning) or randomized. While both of these techniques have a long history, a direct application to novel situations is in many cases problematic. This dissertation addresses some of these issues. We begin with an active learning strategy for spectral clustering when the cost of assessing individual ...
There has been growing recent interest in the field of active learning for binary classification. Th...
Object classification by learning from data is a vast area of statistics and machine learning. Withi...
When dealing with datasets containing a billion instances or with simulations that requir...
Active learning is a kind of semi-supervised learning method in which learning algorith...
Data subsampling has become widely recognized as a tool to overcome computational and economic bottl...
In this work we proposed a novel transductive method to solve the problem of learning from partially...
In many settings in practice it is expensive to obtain labeled data while unlabeled data is abundant...
Thesis (Ph.D.)--University of Washington, 2012. Active learning is a machine learning setting where th...
Data de-duplication concerns the identification and eventual elimination of records, in a particular...
We investigate active learning by pairwise similarity over the leaves of trees...
What data should we gather to learn about the underlying structure of the world as quickly as possib...
Active learning (AL) is a branch of machine learning that deals with problems where unlabeled data i...
Active learning is a machine learning technique in which a learning algorithm is able to interactive...
We describe an adaptation of the simulated annealing algorithm to nonparametric clustering and rela...