The dataset is a collection of Arabic texts covering the modern Arabic language used in newspaper articles. The text contains alphabetic, numeric and symbolic words. The presence of numeric and symbolic words makes the dataset useful for testing the efficiency and robustness of Arabic text classification and indexing methods. The dataset consists of 111,728 documents (cf. Table 1) and 319,254,124 words (cf. Table 2) stored in text files, collected from 3 Arabic online newspapers, Assabah [9], Hespress [10] and Akhbarona [11], using a semi-automatic web crawling process. The documents in the dataset are categorized into 5 classes: sport, politic, culture, economy and diverse. The number of documents and words for each class varies fro...
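As a rough illustration of how per-class document and word counts like those reported above could be recomputed, the following Python sketch walks a local copy of the text files. The directory layout (one folder per class, one UTF-8 `.txt` file per document), the function name `corpus_stats`, and the use of whitespace tokenization are assumptions made for this example; the description does not specify how the files are organized or how words were counted.

```python
from pathlib import Path

# Class names as listed in the dataset description.
# Assumption: each class is stored in a folder with the same name.
CLASSES = ("sport", "politic", "culture", "economy", "diverse")


def corpus_stats(root):
    """Return {class_name: (n_documents, n_words)} for each class folder found."""
    stats = {}
    for name in CLASSES:
        class_dir = Path(root) / name
        if not class_dir.is_dir():
            continue  # skip classes missing from this copy of the data
        n_docs = 0
        n_words = 0
        for path in sorted(class_dir.glob("*.txt")):
            text = path.read_text(encoding="utf-8", errors="replace")
            n_docs += 1
            # Whitespace tokenization keeps numeric and symbolic tokens as words.
            n_words += len(text.split())
        stats[name] = (n_docs, n_words)
    return stats
```

Calling `corpus_stats("path/to/dataset")` returns one `(documents, words)` pair per class folder that exists. Counts obtained this way may differ from the published figures if the original word count used a different tokenization.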
This dataset is a relatively large collection of Arabic news tweets that were collected from an...
Language Engineering, including Information Retrieval, Machine Translation and other Natural Languag...
Today, text categorization is usually used in various areas, such as: information retrieval, data mi...
The dataset is a collection of Arabic texts covering the modern Arabic language used in newspaper ...
The dataset is a collection of Arabic texts covering the modern Arabic language used in newspaper ...
The dataset is a collection of Arabic texts covering the modern Arabic language used in newspaper ...
SANAD Dataset is a large collection of Arabic news articles that can be used in different Arabic NLP...
SANAD Dataset is a large collection of Arabic news articles that can be used in different Arabic NLP...
SANAD Dataset is a large collection of Arabic news articles that can be used in different Arabic NLP...
A huge amount of Arabic text is available online, which requires an organization of these ...
SANAD Dataset is a large collection of Arabic news articles that can be used in different Arabic NLP...
A huge amount of Arabic text is available online, which requires an organization of these ...
An enormous amount of valuable human knowledge is preserved in documents. The rapid growth in the nu...
PAAD: Political Arabic Article Dataset is a collection of political Arabic text, which covers modern...
A huge amount of Arabic text is available online, which requires an organization of these ...