We investigated the use of supervised learning methods that use labels from crowd workers to resolve entities. Although obtaining labeled data by crowdsourcing can reduce time and cost, it also brings challenges (e.g., coping with the variable quality of crowd-generated data). First, we evaluated the quality of crowd-generated labels for actual entity resolution data sets. Then, we evaluated the prediction accuracy of two machine learning methods that use labels from crowd workers: a conventional LPP method using consensus labels obtained by majority voting and our proposed method that combines multiple Laplacians directly by using crowdsourced data. We discussed the relationship between the accuracy of workers’ labels and the prediction acc...
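The consensus labels mentioned above, obtained by majority voting, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `majority_vote`, the `tie_label` parameter, the record identifiers, and the example labels are all hypothetical, and the encoding (1 for "same entity", 0 for "different entities") is an assumption made for illustration.

```python
from collections import Counter


def majority_vote(labels_by_pair, tie_label=None):
    """Return one consensus label per record pair by majority voting.

    labels_by_pair maps a record pair to the list of labels given by
    different crowd workers. Ties and empty lists yield tie_label.
    """
    consensus = {}
    for pair, labels in labels_by_pair.items():
        if not labels:
            # No worker labeled this pair, so no consensus exists.
            consensus[pair] = tie_label
            continue
        counts = Counter(labels).most_common()
        # A tie occurs when the two most frequent labels have equal counts.
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus[pair] = tie_label
        else:
            consensus[pair] = counts[0][0]
    return consensus


# Hypothetical crowd labels: 1 = "same entity", 0 = "different entities".
crowd_labels = {
    ("r1", "r2"): [1, 1, 0],
    ("r1", "r3"): [0, 0, 0],
    ("r2", "r3"): [1, 0],
}
consensus = majority_vote(crowd_labels)
```

Majority voting weights every worker equally, so a few inaccurate workers can flip the consensus for a pair that has only a handful of labels. This is the variable-quality challenge the abstract refers to.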
In supervised learning - for instance in image classification - modern massive datasets are commonly...
We study the problem of enhancing Entity Resolution (ER) with the help of crowdsourcing. ER is the p...
Crowdsourcing has become an effective and popular tool for human-powered computation to label large ...
With crowdsourcing systems, labels can be obtained with low cost, which facilitates the creation of ...
Crowdsourcing has been an effective ...
Although supervised learning requires a labeled dataset, obtaining labels from experts is generally ...
Crowdsourcing is widely used nowadays in machine learning for data labeling. Although in the traditi...
Current quality control methods for crowdsourcing largely account for variations in worker responses...
There are several computational tasks for which the help of people is useful. One such task is entit...
Although supervised learning requires a labeled dataset, obtaining labels from experts is generall...
Entity resolution (ER) is the task of identifying all records in a database that refer to the same u...
Project Specification Crowdsourcing is gaining popularity in academia with the launch of crowdsourc...
© 2019 Dr. Yuan Li. This thesis explores aggregation methods for crowdsourced annotations. Crowdsourci...
Collecting labels for data is important for many practical applications (e.g., data mining). However...
We propose novel algorithms for the problem of crowdsourcing binary labels. Such binary labeling t...