We present the general architecture of the error annotation system applied to the COPLE2 corpus, a learner corpus of Portuguese implemented on the TEITOK platform. We give a general overview of the corpus and of the TEITOK functionalities and describe how the error annotation is structured in a two-level system: first, a fully manual token-based and coarse-grained annotation is applied and produces a rough classification of the errors in three categories, paired with multi-level information for POS and lemma; second, a multi-word and fine-grained annotation in standoff is then semi-automatically produced based on the first level of annotation. The token-based level has been applied to 47% of the total corpus. We compare our system with othe...
The Corpus of Portuguese Undergraduates' Texts (CUTe) is an error-tagged learner corpus of Portugues...
Annotating a corpus with error information is a challenging task. This paper describes the design, e...
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annota...
We present the error tagging system of the COPLE2 corpus and the first results of its implementation...
In this article, we present COPLE2, a new corpus of Portuguese that encompasses written and spoken d...
We present the COPLE2 corpus, a learner corpus of Portuguese that includes written and spoken texts ...
In this paper, we address the question of automatic annotation of English lear...
We present a freely available corpus containing source language texts from different domains along w...
Error coding of second-language learner text, that is, detecting, correcting and annotating errors, ...
This paper is a work-in-progress report on error annotation in the Lithuanian Learner Corpus (LLC), ...
In this thesis, we investigate methods for automatic detection, and to some extent correction, of gr...
The corpus contains 2094 texts from the corpus Šolar 2.0 (http://hdl.handle.net/11356/1214), i.e. on...
This paper proposes a model for the design of interlanguage corpus with error analysis annotation. T...
Title: Evaluation of Error Mark-Up in a Learner Corpus of Czech Author: Barbora Štindlová Department...