Book ISBN 978-1-61499-912-6 (online)

This paper presents an effort to provide a level-appropriate study corpus for learners of Lithuanian. The collected corpus includes levelled texts from study books and unlevelled texts from other sources. The main goal is to assign level-appropriate labels (A1, A2, B1, B2) to the texts from other sources. For automatic classification we use preselected surface features, based on text readability research, and shallow linguistic features. First, we train the model on the levelled texts from study books; second, we apply the learned model to classify the other texts. The best classification results are achieved with the Logistic Regression method.

Lituanistikos katedra
Užsienio kalbų, lit. ir vert. s. katedra...
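The abstract mentions surface features drawn from readability research but does not list them. The following is a minimal sketch of what such a feature extractor might look like, assuming three common readability measures (average sentence length, average word length, type-token ratio). The function name `surface_features`, the choice of features, and the naive sentence splitting are all illustrative assumptions, not the authors' feature set or tokenizer.

```python
import re


def surface_features(text):
    """Return a few readability-style surface features for one text.

    Hypothetical feature set: the paper's actual features are not
    given in the abstract.
    """
    # Naive sentence split on ., ! or ? (an assumption, not the paper's method).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w+ is Unicode-aware in Python 3, so Lithuanian letters are kept in words.
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0, "type_token_ratio": 0.0}
    return {
        # Mean number of words per sentence.
        "avg_sentence_len": len(words) / len(sentences),
        # Mean number of characters per word.
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # Distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words),
    }


features = surface_features("Labas rytas. Kaip sekasi?")
print(features)
```

Feature vectors of this kind, computed for the levelled study-book texts, could then serve as training input for a classifier such as logistic regression, with the trained model applied to the vectors of the unlevelled texts.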
"Classifying Estonian Web texts" Due to the size of the Internet and the multitude of traditional...
The aim of this article is to present an exploratory study that sought to test usage patterns...
Despite the existence of effective methods that solve named entity recognition tasks for such widely...
The paper aims to present the first pedagogic corpus of Lithuanian, i.e. a monolingual specialized corp...
The article presents a new resource for A2-B2 learners of Lithuanian as L2 to improve their lexical ...
This paper discusses the problem of automatic CEFR (CEFR – Common European Framework of Reference fo...
This paper discusses research on Lithuanian texts of different styles for the development of the met...
This paper presents a document comparison and classification model for Lithuanian language t...
The aim of this paper is to present the design of the Lithuanian Learner Corpus (LLC), the initial s...
This paper describes our research on statistical language modeling of Lithuanian. The idea of improv...
In this paper we present the results of an automatic classification of Russian texts into three leve...
Translated text has certain features which mark it as such, which can be identified using statistica...
The aim of this research is to explore the productive vocabulary that beginner learners of Lithuania...
In this paper we present an algorithm that detects, recognizes and annotates sentence boundaries and...