This paper presents an effort to provide a level-appropriate study corpus for learners of Lithuanian. The corpus comprises levelled texts from study books and unlevelled texts from other sources. The main goal is to assign level labels (A1, A2, B1, B2) to the texts from other sources. For automatic classification we use preselected surface features, based on text readability research, and shallow linguistic features. First, we train a model on the levelled texts from study books; second, we apply the learned model to classify the other texts. The best classification results are achieved with the Logistic Regression method.
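The two-step procedure described above (train on levelled texts, then label unlevelled ones) can be illustrated with a minimal sketch. It is not the implementation used in the paper: the three surface features (average sentence length, average word length, type-token ratio) are assumed stand-ins for the preselected feature set, the training sentences are invented placeholders rather than corpus data, only two of the four levels appear, and the logistic regression is a small hand-written multinomial version fitted by batch gradient descent on unscaled features.

```python
import math
import re


def surface_features(text):
    """Assumed features: avg sentence length (words), avg word length, type-token ratio."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not words or not sentences:
        return [0.0, 0.0, 0.0]
    return [
        len(words) / len(sentences),
        sum(len(w) for w in words) / len(words),
        len(set(words)) / len(words),
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_logreg(X, y, labels, epochs=500, lr=0.01):
    """Multinomial logistic regression fitted by batch gradient descent."""
    n = len(X[0]) + 1  # one extra slot for the bias term
    W = {lab: [0.0] * n for lab in labels}
    for _ in range(epochs):
        grad = {lab: [0.0] * n for lab in labels}
        for x, target in zip(X, y):
            xb = x + [1.0]
            probs = softmax([sum(w * v for w, v in zip(W[lab], xb)) for lab in labels])
            for lab, p in zip(labels, probs):
                err = p - (1.0 if lab == target else 0.0)
                for j, v in enumerate(xb):
                    grad[lab][j] += err * v
        for lab in labels:
            for j in range(n):
                W[lab][j] -= lr * grad[lab][j] / len(X)
    return W


def predict(W, labels, x):
    """Return the label with the highest linear score."""
    xb = x + [1.0]
    scores = [sum(w * v for w, v in zip(W[lab], xb)) for lab in labels]
    return labels[scores.index(max(scores))]


# Step 1: train on levelled texts (invented placeholder sentences, not corpus data).
labels = ["A1", "B2"]
levelled = [
    ("I am here. You are there.", "A1"),
    ("This is a cat. It is big.", "A1"),
    ("Although the committee postponed its decision, several members objected.", "B2"),
    ("Researchers investigating readability frequently combine lexical and syntactic indicators.", "B2"),
]
X = [surface_features(text) for text, _ in levelled]
y = [level for _, level in levelled]
W = train_logreg(X, y, labels)

# Step 2: apply the learned model to an unlevelled text (also a placeholder).
print(predict(W, labels, surface_features("We go home. It is late.")))
```

In practice the features would be standardised before fitting and the model evaluated on held-out levelled texts; both steps are omitted here to keep the sketch short.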
In this paper, we present first results of training a classifier for discriminating Russian texts in...
In this paper we present an algorithm that detects, recognizes and annotates sentence boundaries and...
We report an ongoing study on quantitative characteristics of texts written in different genres. At ...
Book ISBN 978-1-61499-912-6 (online). This paper presents an effort to provide a level-appropriate s...
The paper aims to present the first pedagogic corpus of Lithuanian i.e. monolingual specialized corp...
The article presents a new resource for A2-B2 learners of Lithuanian as L2 to improve their lexical ...
This paper discusses the problem of automatic CEFR (CEFR – Common European Framework of Reference fo...
In this paper we present the results of an automatic classification of Russian texts into three leve...
This paper describes our research on statistical language modeling of Lithuanian. The idea of improv...
The aim of this paper is to present the design of the Lithuanian Learner Corpus (LLC), the initial s...
Abstract. This paper presents document comparison and classification model for Lithuanian language t...
"Classifying Estonian Web texts" Due to the size of the Internet and the multitude of traditional...
Translated text has certain features which mark it as such, which can be identified using statistica...
The aim of this research is to explore the productive vocabulary that beginner learners of Lithuania...
This paper discusses research on Lithuanian texts of different styles for the development of the met...