We investigate how disagreement in natural language inference (NLI) annotation arises. We developed a taxonomy of disagreement sources with 10 categories spanning 3 high-level classes. We found that some disagreements are due to uncertainty in the sentence meaning, while others are due to annotator biases and task artifacts, leading to different interpretations of the label distribution. We explore two modeling approaches for detecting items with potential disagreement: a 4-way classification with a "Complicated" label in addition to the three standard NLI labels, and a multilabel classification approach. We found that the multilabel classification is more expressive and gives better recall of the possible interpretations in the data.
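To make the difference between the two modeling approaches concrete, the following minimal sketch shows one way training targets could be derived from a set of annotator votes under each scheme. It is an illustration only, not the authors' procedure: the function names, the label strings, and the `min_share` threshold of 0.2 are all assumptions introduced here.

```python
from collections import Counter

# The three standard NLI labels (the string names are an assumption).
NLI_LABELS = ("entailment", "neutral", "contradiction")


def multilabel_target(annotations, min_share=0.2):
    """Return every standard label chosen by at least `min_share` of annotators.

    `min_share` is a hypothetical threshold, not a value from the paper.
    Votes outside NLI_LABELS still count toward the total but are never returned.
    """
    if not annotations:
        raise ValueError("at least one annotation is required")
    counts = Counter(annotations)
    total = len(annotations)
    return [label for label in NLI_LABELS if counts[label] / total >= min_share]


def four_way_target(annotations, min_share=0.2):
    """Collapse the votes into one of four classes.

    An item keeps its standard label when exactly one label clears the
    threshold; otherwise it is mapped to the extra "complicated" class.
    """
    supported = multilabel_target(annotations, min_share)
    return supported[0] if len(supported) == 1 else "complicated"


# Hypothetical votes for two items.
split_votes = ["entailment"] * 6 + ["neutral"] * 4
clear_votes = ["contradiction"] * 9 + ["neutral"] * 1

print(multilabel_target(split_votes), four_way_target(split_votes))
print(multilabel_target(clear_votes), four_way_target(clear_votes))
```

Under these assumptions, the multilabel target for the split item still records which labels are in play (entailment and neutral), whereas the 4-way target only records that the item is complicated. This is one way to read the claim that the multilabel formulation is more expressive.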
Semantic annotation tasks contain ambiguity and vagueness and require varying degrees of world knowl...
This work describes an analysis of inter-annotator disagreements in human evaluation of machine tran...
Experts and crowds can work together to generate high-qu...
Natural language inference (NLI) is the task of determining whether a piece of text is entailed, con...
In NLP annotation, it is common to have multiple annotators label the text and then obtain the groun...
Linguistic annotation underlies many successful approaches in Natural Language...
Supervised learning assumes that a ground truth label exists. However, the reliability of this groun...
Disagreement between coders is ubiquitous in virtually all datasets annotated with human judgements ...
This paper describes a methodology for supporting the task of annotating sentiment in natural langua...
Crowdsourced data are often rife with disagreement, either because of genuine item ambiguity, overla...
The focus of this paper is on how events can be detected and extracted from natural language text, ...
For a highly subjective task such as recognising speaker intention and argumentation, the traditiona...
Many tasks in Natural Language Processing (NLP) and Computer Vision (CV) offer evidence that humans ...