Kurdish is a less-resourced language consisting of different dialects written in various scripts. Approximately 30 million people in different countries speak the language. The lack of corpora is one of the main obstacles in Kurdish language processing. In this paper, we present KTC, the Kurdish Textbooks Corpus, which is composed of 31 K-12 textbooks in the Sorani dialect. The corpus is normalized and categorized into 12 educational subjects and contains 693,800 tokens (110,297 types). Our resource is publicly available for non-commercial use under the CC BY-NC-SA 4.0 license. We would like to thank the Ministry of Education of the Kurdistan Region of Iraq for its generous assistance, particularly the General Directorate of Curriculum and Prin...
This paper describes the development of lexicographic resources for Kurdish and provides a lexical m...
With the rapid growth of Kurdish language content on the web, there is a high demand for making this...
The article discusses language corpora, the history of their creation, requirements for creating a ...
One of the major challenges that underrepresented and endangered language communities face in langua...
The Kurdish language is regarded as one of the less-resourced languages. The language is globally p...
In this paper, we describe a general methodology for developing a large-scale lexicon for a less-res...
The text file contains over 17 million Kurdish Sorani texts. The Kurdish text corpus was collected f...
Data Description: The Kurdish language belongs to the Indo-Iranian family of Indo-European languages. ...
While the computational processing of Kurdish has experienced a relative increase, the machine trans...
The aim of this thesis is to study some of the challenges that the Kurdish language and its standard...
This paper describes the development of lexicographic resources for Kurdish and provides a lexical ...
In computational linguistics, corpora are important; in particular, text corpora are used in the crea...
The Kurdish Character dataset is a comprehensive collection of the Kurdish alphabet, providing detai...
Modern Standard Kurdish (MSK) is a written form of Kurdish adopted by the Iraqi Kurds to establish a...
In many countries around the world, determining what languages or dialects should be the instrument ...