Dimensionality reduction is an essential task for many large-scale information processing problems, such as classifying document sets and searching over Web data sets. It can improve both the efficiency and the effectiveness of classifiers. In this paper, we present a comparative study of five dimension reduction techniques in the context of Arabic text classification, using an in-house Arabic dataset. We evaluated and compared Stemming, Light-Stemming, Document Frequency (DF), TFIDF, and Latent Semantic Indexing (LSI) as methods for reducing the feature space to an input space of much lower dimension for the neural network classifier. The results showed that the proposed model was able to achieve high categorization effe...
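To make two of the techniques named above concrete, here is a minimal sketch of Document Frequency (DF) thresholding followed by TFIDF weighting. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `min_df` threshold, the plain `tf * log(N / df)` weighting, and the toy English tokens are all hypothetical, and the documents are assumed to be tokenized (and stemmed, if desired) beforehand.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def select_by_df(docs, min_df=2):
    """DF thresholding: keep only terms that occur in at least min_df documents."""
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)


def tfidf_vectors(docs, vocabulary):
    """Weight each retained term by tf * log(N / df), one vector per document."""
    n_docs = len(docs)
    df = document_frequency(docs)
    idf = {term: math.log(n_docs / df[term]) for term in vocabulary}
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)  # missing terms count as 0
        vectors.append([tf[term] * idf[term] for term in vocabulary])
    return vectors


# Hypothetical toy corpus (English placeholders standing in for Arabic tokens).
docs = [
    ["economy", "market", "bank"],
    ["market", "bank", "loan"],
    ["football", "match", "goal"],
    ["football", "goal", "market"],
]

vocab = select_by_df(docs, min_df=2)   # 7 distinct terms reduced to 4
vectors = tfidf_vectors(docs, vocab)   # one 4-dimensional vector per document
```

In this sketch the DF step alone shrinks the feature space, and the resulting vectors could then serve as input to a classifier. LSI would go one step further by projecting such vectors onto a small number of latent dimensions via a truncated singular value decomposition, which is omitted here.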
The feature selection problem is one of the most important problems in the text and data mining domain. ...
With the tremendous amount of electronic documents available, there is a great need to classify docu...
Text Categorization (classification) is the process of classifying documents into a predefined set o...
Feature reduction methods have been successfully applied to text categorization. In this paper, we p...
In this paper, we present a model based on the Neural Network (NN) for classifying Arabic texts. We ...
Cosine similarity is one of the most popular distance measures in text classification problems. In t...
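For reference, cosine similarity between two term-weight vectors is the dot product divided by the product of their Euclidean norms. The sketch below is a generic illustration of that formula, not code from the cited paper; the function name and the choice to return 0.0 for an all-zero vector are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: an empty document vector is treated as similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Vectors pointing in the same direction score (approximately) 1,
# vectors with no shared terms score 0.
same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
no_overlap = cosine_similarity([1, 0], [0, 1])
```

Because the score depends only on direction, a long and a short document with the same term proportions are treated as equally similar to a query vector.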
Feature selection is necessary for effective text classification. Dataset preprocessing is ...
Classifying or categorizing texts is the process by which documents are classified into groups by su...
This paper compares and contrasts two feature selection techniques when applied to an Arabic corpus; in...
Due to the increased demand for automatic document organization, text classification is essential in...
Text Categorization is a technique for assigning documents based on their contents to one or more pr...
Today, text categorization is used in various areas, such as information retrieval, data mi...
Text classification is the task of assigning a document to one or more predefined categories bas...
Document categorization is an important topic that is central to many applications that dem...
The Arabic language is a highly flexional and morphologically very rich language. It prese...