Feature selection (FS) is a widely used method for removing redundant or irrelevant features to improve classification accuracy and decrease the model’s computational cost. In this paper, we present an improved method (referred to hereafter as RARF) for Arabic text classification (ATC) that employs term frequency-inverse document frequency (TF-IDF) weighting and the Word2Vec embedding technique to identify words that have a particular semantic relationship. In addition, we compare our method with four benchmark FS methods, namely principal component analysis (PCA), linear discriminant analysis (LDA), chi-square, and mutual information (MI). Support vector machine (SVM), k-nearest neighbors (K-NN), and naive Bayes (NB) are three machine learning ...
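As a rough illustration of the TF-IDF weighting mentioned in the abstract above, the following is a minimal sketch and not the RARF method itself. It assumes the common unsmoothed variant (relative term frequency multiplied by log(N/df)), ranks terms by their highest weight in any document, and leaves out the Word2Vec step entirely. The function names `tf_idf` and `top_k_terms` and the toy English tokens are hypothetical placeholders; real Arabic input would first need tokenization and normalization.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc_length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_k_terms(weights, k):
    """Keep the k terms with the highest TF-IDF weight in any document."""
    best = {}
    for doc_weights in weights:
        for term, w in doc_weights.items():
            best[term] = max(best.get(term, 0.0), w)
    # Sort by descending weight; break ties alphabetically for determinism.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


# Toy corpus (hypothetical data, already tokenized).
docs = [
    ["economy", "market", "bank", "market"],
    ["match", "team", "goal", "team"],
    ["market", "team", "news"],
]
weights = tf_idf(docs)
selected = top_k_terms(weights, 2)
```

Terms that occur in every document receive a weight of zero under this variant, which is why smoothed formulas are often preferred in practice.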
There have been great improvements in web technology over the past years which...
There is a huge amount of Arabic text available online that requires an organization of these ...
Feature selection is one of the well-known solutions for reducing the high-dimensionality problem of text categ...
Automated document classification is an important text mining task especially with the rapid growth ...
Text classification (TC) concerns automatically assigning a class (category) label to a text docume...
Feature selection is necessary for effective text classification. Dataset preprocessing is ...
Compared to other languages, there is still a limited body of research which has been cond...
Text classification is a very important area in information retrieval. Text classification techniques ...
The feature selection problem is one of the most important problems in the text and data mining domain. ...
Text classification (TC) is the process of classifying documents into a predefined set of categories...
Today, text categorization is used in various areas, such as information retrieval, data mi...
Cosine similarity is one of the most popular distance measures in text classification problems. In t...
With growing texts of electronic documents used in many applications, a fast and accurate text class...
Recent research on Big Data proposed and evaluated a number of advanced techniques to gain meaningfu...