Text classification is the task of assigning a document to one or more predefined categories based on its content. This paper presents the results of classifying Arabic-language documents with the KNN classifier under two document indexing schemes: N-grams, namely unigrams and bigrams, and the traditional single-term indexing method (bag of words), which assumes that the terms in a text are mutually independent, an assumption that does not hold in practice. The results show that N-gram indexing produces better accuracy than single-term indexing: the average accuracy is 0.7357 with N-grams and 0.6688 with single terms.
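To make the comparison concrete, the following is a minimal sketch of KNN classification under the two indexing schemes. It is not the paper's implementation: the word-level n-grams, raw term-frequency weights, cosine similarity, majority vote, the function names, and the toy English corpus are all assumptions chosen for illustration. The same code applies to whitespace-tokenized Arabic text.

```python
from collections import Counter
from math import sqrt


def index_terms(text, max_n=1):
    """Count word n-grams for n = 1..max_n.

    max_n=1 is single-term (bag-of-words) indexing;
    max_n=2 adds bigrams on top of the unigrams.
    """
    tokens = text.split()
    grams = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, training, k=3, max_n=1):
    """Label `query` by majority vote among its k most similar training documents."""
    q = index_terms(query, max_n)
    scored = sorted(
        ((cosine(q, index_terms(doc, max_n)), label) for doc, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus; the paper's Arabic data set is not reproduced here.
TRAINING = [
    ("stock market prices rise", "economy"),
    ("central bank raises interest rates", "economy"),
    ("team wins football match", "sports"),
    ("player scores goal in final match", "sports"),
]

if __name__ == "__main__":
    query = "bank interest rates rise"
    print("single terms:", knn_classify(query, TRAINING, k=3, max_n=1))
    print("unigrams + bigrams:", knn_classify(query, TRAINING, k=3, max_n=2))
```

With bigrams included, a phrase such as "interest rates" becomes an index term in its own right, so word order contributes to the similarity score instead of each word being treated as independent.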
Preprocessing is one of the main components in a conventional document categorization (DC) framework...
Abstract: Compared to other languages, there is still a limited body of research which has been cond...
With the tremendous amount of electronic documents available, there is a great need to classify docu...
This project presents an implementation of automatic KNN Arabic text categorizer. Six hundred Arabic...
Today, text categorization is usually used in various areas, such as: information retrieval, data mi...
The quantity of text information published in Arabic language on the net requires the implementatio...
There is a huge amount of Arabic text available online that requires an organization of these ...
With growing texts of electronic documents used in many applications, a fast and accurate text class...
Abstract: Document categorization is an important topic that is central to many applications that dem...
Dimensionality reduction is an essential task for many large-scale information processing problems s...
In recent years, a lot of algorithms have been proposed for the classification of the documents. Mos...
Text Categorization is a technique for assigning documents based on their contents to one or more pr...
This paper focuses on studying topic identification for Arabic language by usin...