There is a growing body of work in recent years on developing pre-trained language models (PLMs) for the Arabic language. This work addresses two major problems in existing Arabic PLMs that constrain progress in Arabic NLU and NLG. First, existing Arabic PLMs are not well explored, and their pre-training can be improved significantly with a more methodical approach. Second, the literature lacks a systematic and reproducible evaluation of these models. In this work, we revisit both the pre-training and evaluation of Arabic PLMs. In terms of pre-training, we explore improving Arabic LMs from three perspectives: the quality of the pre-training data, the size of the model, and the incorporation of character-level information. A...
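As a concrete illustration of working with such models, the sketch below loads a publicly available Arabic PLM for feature extraction with the Hugging Face transformers library. The checkpoint name aubmindlab/bert-base-arabertv2 is an assumption made here for illustration, not necessarily a model from this work.

```python
# A minimal sketch of extracting contextual features from an Arabic PLM.
# The checkpoint is an assumed public Arabic BERT, not the model above.
from transformers import AutoTokenizer, AutoModel

checkpoint = "aubmindlab/bert-base-arabertv2"  # assumption: any Arabic PLM works here
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

inputs = tokenizer("اللغة العربية جميلة", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```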
As more and more Arabic texts have emerged on the Internet, extracting important information from these A...
Recurrent Neural Networks (RNNs) and transformers are deep learning models that have achieved remark...
Recent impressive improvements in NLP, largely based on the success of context...
This paper addresses the classification of Arabic text data in the field of Natural Language Process...
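Since the paper's own method is not reproduced here, the following toy sketch only illustrates the task of Arabic text classification, using TF-IDF features and logistic regression in scikit-learn. The four-example corpus and its sports/economy labels are invented purely for demonstration.

```python
# A toy Arabic text classification sketch: TF-IDF + logistic regression.
# Corpus and labels are hypothetical, for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "مباراة كرة القدم انتهت بالتعادل",
    "انخفضت أسعار النفط اليوم",
    "الفريق فاز بالبطولة",
    "ارتفعت أسهم الشركة",
]
labels = ["sports", "economy", "sports", "economy"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["فاز الفريق في المباراة"]))  # toy prediction, likely "sports"
```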
Pretraining data. The models were pretrained on ~4.4 billion words: the Arabic version of OSCAR (unsh...
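Assuming the corpus referred to is the public OSCAR release, a small sketch of streaming its Arabic split with the Hugging Face datasets library might look like the following; whether this exact corpus version was used for pre-training is not confirmed here.

```python
# A sketch of streaming the Arabic split of the public OSCAR corpus.
# Newer datasets versions may additionally require trust_remote_code=True.
from datasets import load_dataset

oscar_ar = load_dataset(
    "oscar", "unshuffled_deduplicated_ar", split="train", streaming=True
)
for i, example in enumerate(oscar_ar):
    print(example["text"][:80])  # first 80 characters of each document
    if i == 2:
        break
```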
This work introduces the notion of a computational resource for organizing knowledge devel...
Arabic is a Semitic language which is widely spoken with many dialects. Given the success of pre-tra...
The use of multilingual language models for tasks in low and high-resource languages has been a succ...
Bidirectional Encoder Representations from Transformers (BERT) has gained increasing attention from ...
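As a quick usage example, a BERT-style Arabic checkpoint can be probed with the fill-mask pipeline. The model name below, CAMeL-Lab/bert-base-arabic-camelbert-mix, is one publicly available Arabic BERT, chosen only as an assumption.

```python
# A minimal sketch of probing a masked Arabic BERT model.
# The checkpoint is an assumed public Arabic BERT.
from transformers import pipeline

fill = pipeline("fill-mask", model="CAMeL-Lab/bert-base-arabic-camelbert-mix")
for pred in fill("عاصمة مصر هي [MASK]."):  # "The capital of Egypt is [MASK]."
    print(pred["token_str"], round(pred["score"], 3))
```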
We propose a Finite State Machine framework for Arabic Language Modeling. The framework provides sev...
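Since the framework itself is not reproduced here, the toy sketch below only illustrates the general idea of finite-state language modeling: states and transitions encode which token sequences the machine accepts. The states and mini-vocabulary are entirely hypothetical.

```python
# A toy finite-state machine over Arabic tokens, for illustration only.
# Transitions map (state, token) pairs to the next state.
TRANSITIONS = {
    ("START", "ال"): "DET",      # definite article
    ("DET", "كتاب"): "NOUN",     # noun after determiner
    ("START", "كتاب"): "NOUN",   # bare noun
    ("NOUN", "جديد"): "ADJ",     # adjective after noun
}
ACCEPTING = {"NOUN", "ADJ"}

def accepts(tokens):
    state = "START"
    for tok in tokens:
        state = TRANSITIONS.get((state, tok))
        if state is None:          # no transition: reject
            return False
    return state in ACCEPTING

print(accepts(["ال", "كتاب", "جديد"]))  # True
print(accepts(["جديد", "كتاب"]))        # False
```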
The Arabic language has recently become the focus of an increasing number of projects in natural lang...
These are several Arabic word embedding models for NLP tasks, which are described in our paper ...
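A minimal sketch of querying such pre-trained Arabic word vectors with gensim, assuming a word2vec-format file; the file name arabic_vectors.bin is a placeholder, not a released artifact name.

```python
# A sketch of loading and querying Arabic word vectors with gensim.
# "arabic_vectors.bin" is a placeholder for the actual released file.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("arabic_vectors.bin", binary=True)
print(wv.most_similar("ملك", topn=5))  # nearest neighbours of "king"
```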
Artificial Neural Networks have proved their efficiency in a large number of research domains. In th...
Arabic Named Entity Recognition (ANER) systems aim to identify and classify Arabic named entities (N...
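As an illustration of the task, the sketch below tags Arabic named entities with a transformers token-classification pipeline. The checkpoint CAMeL-Lab/bert-base-arabic-camelbert-mix-ner is an assumed public Arabic NER model, not necessarily the system described in this work.

```python
# A sketch of Arabic NER with a transformers pipeline.
# The checkpoint is an assumed public Arabic NER model.
from transformers import pipeline

ner = pipeline(
    "ner",
    model="CAMeL-Lab/bert-base-arabic-camelbert-mix-ner",
    aggregation_strategy="simple",  # merge word pieces into entity spans
)
for ent in ner("يعيش محمد في القاهرة"):  # "Mohamed lives in Cairo"
    print(ent["entity_group"], ent["word"])
```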