The article presents SweLL, a new language learner corpus for Swedish, and the methodology behind it, from collection and pseudonymisation (to protect learners' personal information) to annotation adapted to second language learning. The main aim is to deliver a well-annotated corpus of essays written by second language learners of Swedish and to make it available for research through a browsable environment. To that end, a new annotation tool and a new project management tool have been implemented, both with the main purpose of ensuring the reliability and quality of the final corpus. In the article we discuss the reasoning behind metadata selection and the principles of gold corpus compilation, and argue for separating normalization from correction annotation. Special...
Ontology learning from text generally consists of NLP, knowledge extraction and ontology con...
This article illustrates the grammatical and error annotations of a morphologically rich learner lan...
The Swedish Treebank is a syntactically annotated corpus. The annotation includes word and sentence ...
This paper reports on the status of learner corpus anonymization for the ongoing research infrastruc...
We present de-identification and pseudonymization of a learner corpus within the ongoing research in...
This master's thesis focuses on the evaluation of Stockholm Umeå Corpus (SUC) as a ...
Corpora for second language (L2) learning may contain a receptive vocabulary, i.e., vocabulary that ...
This paper aims to present part of the project “From Speech to Sign – learning Swedish Sign Language...
This paper presents a new lexical resource for learners of Swedish as a second language, SweLLex, an...
In this paper we present a dataset of contemporary Swedish containing one billion words. The dataset...
This article discusses questions concerning the creation, annotation and sharing of spoken language ...
This volume presents findings from research on the development of corpus linguistics in Sweden as a ...
Computer learner corpus (CLC) research is still in its infancy. With roots both in corpus linguistic...