Igbo is a low-resource language spoken by approximately 30 million people worldwide. It is the native language of the Igbo people of south-eastern Nigeria. In the Igbo language, diacritics, both orthographic and tonal, play a major role in distinguishing the meaning and pronunciation of words. Omitting diacritics from texts often leads to lexical ambiguity. Diacritic restoration is a pre-processing task that restores missing diacritics to words from which they have been removed. In this work, we applied embedding models to the diacritic restoration task and compared their performance with that of n-gram models. Although word embedding models have been successfully applied to various NLP tasks, they have not been used, to our knowledge, for diacritic...
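To make the task definition concrete, the following is a minimal sketch of diacritic restoration as a lookup problem. It is not the method used in this work: it implements only a most-frequent-variant (unigram) baseline, which is far simpler than the n-gram and embedding models being compared. The function names (`strip_diacritics`, `train_unigram`, `restore`) and the toy tokens are hypothetical and chosen for illustration only.

```python
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(word):
    """Remove combining marks (tone marks, dots below) from a word."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def train_unigram(corpus_tokens):
    """Count, for each stripped form, how often each diacritic variant occurs."""
    variants = defaultdict(Counter)
    for token in corpus_tokens:
        token = unicodedata.normalize("NFC", token)
        variants[strip_diacritics(token)][token] += 1
    return variants


def restore(tokens, variants):
    """Replace each stripped token with its most frequent variant seen in training.

    Tokens never seen in training are returned unchanged.
    """
    restored = []
    for token in tokens:
        candidates = variants.get(token)
        restored.append(candidates.most_common(1)[0][0] if candidates else token)
    return restored


# Toy training data (an assumption for illustration, not a real corpus).
toy_corpus = ["\u00e0kw\u00e0", "\u00e0kw\u00e0", "\u00e1kw\u00e1", "\u1ecd", "\u1ecd", "o"]
model = train_unigram(toy_corpus)
print(restore(["o", "akwa", "xyz"], model))
```

Because this baseline ignores context, it always picks the same variant for an ambiguous stripped form. Resolving that ambiguity is what context-aware models such as n-grams or embeddings are meant to address.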
This project aims to develop linguistic resources to support computational NLP research on the Igbo...
Diacritic Restoration is a necessity in the processing of languages with Latin-based scripts that uti...
Online ISSN: 2335-884X. http://itc.ktu.lt/index.php/ITC/article/view/18066 In this research we compar...
Igbo is a low-resource language spoken by approximately 30 million people worldwide. It is the nativ...
Igbo is a low-resource African language with orthographic and tonal diacritics, which capture distin...
With natural language processing (NLP), researchers aim to get the computer to identify and understa...
Properly written texts in Igbo, a low resource African language, are rich in both orthographic and...
NLP research on low resource African languages is often impeded by the unavailability of basic resou...
Existing NLP models are mostly trained with data from well-resourced languages. Most minority langua...
The orthography of many resource-scarce languages includes diacritically marked characters. Falling ...
NLP research on low resource African languages is often impeded by the unavailability of basic resou...
Abstract. The orthography of many resource-scarce languages includes diacritically marked characters...
Corpus of texts in 12 languages. For each language, we provide one training, one development and one...
Statistical language models are utilized in many speech processing algorithms, e.g., automatic speec...
Natural Language Processing (NLP) research is still in its infancy in Africa. Most of the languages in ...