We consider the problem of clustering n items into K disjoint clusters using noisy answers from crowdsourced workers to pairwise queries of the type: “Are items i and j from the same cluster?” We propose a novel, practical, simple, and computationally efficient active querying algorithm for crowdsourced clustering. Furthermore, our algorithm does not require knowledge of unknown problem parameters. We show that our algorithm succeeds in recovering the clusters when the crowdworkers provide answers with an error probability less than 1/2, and we provide sample complexity bounds on the number of queries made by our algorithm to guarantee successful clustering. While the bounds depend on the error probabilities, the algorithm itself does not requi...
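The query model in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it only simulates a worker who answers "Are items i and j from the same cluster?" incorrectly with probability error_prob, and aggregates repeated answers by majority vote. The function names, the repeat count, and the assumption that repeated answers are independent are all hypothetical choices made for illustration.

```python
import random


def noisy_same_cluster(labels, i, j, error_prob, rng):
    """Simulated worker answer: the true same-cluster relation, flipped with probability error_prob."""
    truth = labels[i] == labels[j]
    # rng.random() is uniform on [0, 1), so the answer is flipped with probability error_prob.
    return truth if rng.random() >= error_prob else not truth


def majority_vote(labels, i, j, error_prob, repeats, rng):
    """Ask the same pairwise query `repeats` times and return the majority answer.

    Assumes (hypothetically) that repeated answers are independent.
    """
    yes = sum(noisy_same_cluster(labels, i, j, error_prob, rng) for _ in range(repeats))
    return 2 * yes > repeats


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [0, 0, 1, 1]  # hypothetical ground-truth cluster of each item
    print(majority_vote(labels, 0, 1, error_prob=0.3, repeats=301, rng=rng))
    print(majority_vote(labels, 0, 2, error_prob=0.3, repeats=301, rng=rng))
```

With an odd number of repeats there are no ties. When error_prob is below 1/2 the majority answer is correct with probability approaching 1 as repeats grows, which gives some intuition for the 1/2 threshold stated in the abstract; above 1/2 the majority is systematically wrong.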
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for a...
We study the problem of frequent itemset mining in domains where data is not recorded in a conventio...
We consider the following problem: given a set of clusterings, find a single clustering that agrees ...
This thesis focuses on solving the $K$-means clustering problem approximately with side information ...
A wide range of applications in engineering as well as the natural and social sciences have datasets...
Crowdsourcing utilizes human ability by distributing tasks to a large number of workers. It is espec...
In correlation clustering, we are given $n$ objects together with a binary sim...
Clustering is inherently ill-posed: there often exist multiple valid clusterings of a single dataset...
Active learning for semi-supervised clustering allows algorithms to solicit a ...
Constraint-based clustering leverages user-provided constraints to produce a clustering that matches...
We consider the problem of crowdsourced clustering of a set of items based on queries of the similar...
Due to the widespread use and importance of crowdsourcing in gathering training data at scale, the d...
This work studies clustering algorithms which operate with ordinal or comparison-based queries (ope...
© Springer Nature Switzerland AG 2018. Constraint-based clustering algorithms exploit background kno...