Crowd sequential annotation can be an efficient and cost-effective way to build large datasets for sequence labeling. Unlike tagging independent instances, the quality of a crowd-annotated label sequence depends on how well annotators capture the internal dependencies among the tokens in the sequence. In this paper, we propose Modeling Sequential Annotation for Sequence Labeling with Crowds (SA-SLC). First, a conditional probabilistic model is developed to jointly model the sequential data and the annotators' expertise, in which a categorical distribution is introduced to estimate the reliability of each annotator in capturing local and non-local label dependencies for sequential annotation. To accelerate the margina...
Distributing labeling tasks among hundreds or thousands of annotators is an increasingly important m...
Labeling large datasets has become faster, cheaper, and easier with the advent of crowdsourcing ser...
Real-world data for classification is often labeled by multiple annotators. For analyzing such data,...
Crowdsourcing is a popular cheap alternative in machine learning for gathering information from a se...
Crowdsourcing marketplaces are widely used for curating large annotated datasets by collecting labe...
Machine learning applications can benefit greatly from vast amounts of data, provided that reliable ...
The labeling process within a supervised learning task is usually carried out by an expert, which pr...
With the increasing popularity of online crowdsourcing platforms such as Amazon Mech...
Most models used in natural language processing must be trained on large corpora of labeled text. Th...
Crowdsourcing lets us collect multiple annotations for an item from several annotators. Typically, t...
Crowdsourcing is widely used nowadays in machine learning for data labeling. Although in the traditi...
Existing partial sequence labeling models mainly focus on the max-margin framework, which fails to provid...
For annotation tasks involving independent judgments, probabilistic models have been used to infer g...
With the increasing popularity of online crowdsourcing platforms such as Amazon Mechanical Turk (AMT...
Current methods for sequence tagging depend on large quantities of domain-specific training data, li...