The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked tiers, designed to handle a wide range of error types present in the input. Each tier corrects different types of errors; links between the tiers allow capturing errors in word order and complex discontinuous expressions. Errors are not only corrected, but also classified. The annotation scheme is tested on a data set including approx. 175,000 words with fair inter-annotator agreement results. We also explore the possibility of applying automated linguistic annotation tools (taggers, spell checkers and grammar checkers) to the learner text to support or even substitute manual annotation.
Natural language correction, a subfield of natural language processing (NLP), is the task of automat...
The paper describes the learner corpus composed of English essays written by native Russian speakers...
The dataset was compiled to conduct the investigation of word order errors in texts written by forei...
Title: Evaluation of Error Mark-Up in a Learner Corpus of Czech Author: Barbora Štindlová Department...
Bachelor thesis "Possibilities of Error Annotation of Non-Native Speakers' Czech" compares annotatio...
We are developing and annotating a learner corpus of Hungarian, composed of student journals from th...
Error coding of second-language learner text, that is, detecting, correcting and annotating errors, ...
The aim of the thesis is to propose a tagging system for a learner corpus of spoken English which wo...
In this paper, we address the question of automatic annotation of English lear...
We describe the creation of an annotation layer for word-based writing errors for a corpus of studen...
This paper is a summary of our PhD thesis that presents the conception and the...
Learner corpora – principled collections of learner language – provide interesting insights into the...
This work describes a machine learning approach for checking the part-of-speech annotation, and pres...
In this thesis, we investigate methods for automatic detection, and to some extent correction, of gr...
A case study based on experience in linguistic investigations using annotated monolingual and multil...