Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore highly desirable for the annotations to be of high quality. Recent work, however, has shown that several popular datasets contain a surprising number of annotation errors or inconsistencies. To alleviate this issue, many methods for annotation error detection have been devised over the years. While researchers show that their approaches work well on their newly introduced datasets, they rarely compare their methods to previous work or evaluate them on the same datasets. This raises strong concerns about the methods’ general performance and makes it difficult to assess their strengths and weaknesses. We therefore reimplement ...
In this paper, we address the question of automatic annotation of English lear...
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annota...
This is the accompanying data for our paper "Annotation Error Detection: Analyzing the Past and Pres...
We introduce a method for error detection in automatically annotated text, aimed at supporting the c...
This paper describes a methodology for supporting the task of annotating sentiment in natural langua...
In this thesis, we investigate methods for automatic detection, and to some extent correction, of gr...
We develop a method for detecting errors in semantic predicate-argument annotation, based on the var...
While automatically computing numerical scores remains the dominant paradigm in NLP system evaluatio...
This paper describes a statistical approach to detect annotation errors in dependency treebanks. As ...
This is the accompanying data for the paper "Analyzing Dataset Annotation Quality Management in the...