Language models are the foundation of current neural network-based models for natural language understanding and generation. However, research on the intrinsic performance of language models on African languages has been extremely limited, and it is made more challenging by the lack of the large, standardised training and evaluation sets that exist for English and other high-resource languages. In this paper, we evaluate the performance of open-vocabulary language models on low-resource South African languages, using byte-pair encoding to handle the rich morphology of these languages. We evaluate different variants of n-gram models, feedforward neural networks, recurrent neural networks (RNNs), and Transformers on small-scale datasets. Overall, ...
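As a concrete illustration of the byte-pair encoding step described above, the sketch below trains a BPE subword model with the SentencePiece library and segments a sample sentence. It is a minimal sketch, not the paper's actual configuration: the corpus path, language, and vocabulary size are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the authors' exact pipeline): train a
# byte-pair encoding (BPE) subword model with SentencePiece so that an
# open-vocabulary language model can handle rich morphology.
import sentencepiece as spm

# Train a BPE model on a small monolingual corpus (one sentence per line).
# "isizulu_train.txt" and vocab_size=8000 are placeholder choices.
spm.SentencePieceTrainer.train(
    input="isizulu_train.txt",
    model_prefix="zu_bpe",        # writes zu_bpe.model and zu_bpe.vocab
    vocab_size=8000,
    model_type="bpe",
    character_coverage=1.0,       # keep every character for full coverage
)

# Segment text into subwords; the n-gram, RNN, or Transformer language models
# are then trained and evaluated over these subword sequences, and results
# can be compared across segmentations in bits per character.
sp = spm.SentencePieceProcessor(model_file="zu_bpe.model")
print(sp.encode("Ngiyabonga kakhulu", out_type=str))
```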
Subwords have become the standard units of text in NLP, enabling efficient open-vocabulary models. W...
For resource-rich languages, recent works have shown Neural Network based Language Models (NNLMs) t...
Multilingual transformer models like mBERT and XLM-RoBERTa have obtained great...
Over the past five years neural network models have been successful across a range of computational ...
MSc thesis, Stellenbosch University, 2021. The majority of African languages have...
There are over 7000 languages spoken on earth, but many of these languages suffer from a dearth of n...
Almost none of the 2,000+ languages spoken in Africa have widely available automatic speech recognit...
The paper describes the University of Cape Town's submission to the constrained track of the WMT22 S...
This paper explores state-of-the-art techniques for creating language models in low-resource setting...
Neural Machine Translation (NMT) models have achieved remarkable performance on translating between ...
Mini-dissertation (MIT (Big Data Science)), University of Pretoria, 2023. It was researched whether a...