This paper describes an algorithm that represents documents in a reduced vector space through feature extraction. The algorithm is evaluated on the supervised classification of news articles. We generate a document representation (profile) made up of semantic tags drawn from a machine-readable dictionary. We handle synonymy through thematic conflation, and polysemy through a statistical word-sense disambiguation method that we have developed. We propose four variants of profile generation, depending on whether a recursive system is used and whether a corrective factor for polysemous words is applied. We have evaluated 32 variants, depending on the algorithm type and...
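The idea of a tag-based profile with a corrective factor for polysemous words can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy dictionary, the function name `build_profile`, and the choice of splitting a word's weight evenly across its senses are all assumptions made for illustration, and the recursive variants and the statistical disambiguation step are not covered.

```python
from collections import Counter

# Hypothetical toy dictionary (assumption): word -> semantic tags, one tag per sense.
# Synonyms such as "loan" and "credit" conflate to the same tag.
TOY_DICTIONARY = {
    "bank": ["finance", "geography"],  # polysemous: two senses
    "loan": ["finance"],
    "credit": ["finance"],
    "river": ["geography"],
}


def build_profile(tokens, dictionary, correct_polysemy=True):
    """Return a normalized tag-weight profile for a tokenized document.

    With correct_polysemy=True, a word with k senses contributes 1/k to
    each of its tags (an assumed corrective factor); otherwise it
    contributes a full count to every tag.
    """
    weights = Counter()
    for token in tokens:
        tags = dictionary.get(token.lower(), [])
        if not tags:
            continue  # words missing from the dictionary are ignored
        share = 1.0 / len(tags) if correct_polysemy else 1.0
        for tag in tags:
            weights[tag] += share
    total = sum(weights.values())
    if total == 0:
        return {}
    return {tag: weight / total for tag, weight in weights.items()}


document = ["bank", "loan", "credit"]
profile = build_profile(document, TOY_DICTIONARY)
uncorrected = build_profile(document, TOY_DICTIONARY, correct_polysemy=False)
```

In this toy case the corrected profile gives the "geography" tag a smaller share than the uncorrected one, because the ambiguous word "bank" only contributes half a count to each of its senses. The document is reduced from a vocabulary-sized vector to one weight per semantic tag.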
This paper presents an extension of prior work by Michael D. Lee on psychologically plausible text c...
This paper investigates the problem of text classification. The task of text classification is to as...
To process textual data with statistical methods such as machine learning (ML), the data often...
Exploiting multimedia documents leads to representation problems of the textual and visual informati...
Automatic text classification is the process of automatically classifying text documents into pre-de...
The bag-of-words (BOW) model is the common approach for classifying documents, where words are used ...
This thesis deals with text categorization. The first part describes several chosen algorithm...
Dimensionality reduction (DR) through feature extraction (FE) is desirable for efficient and effecti...
We have worked for a long time on the classification of text. Early on, many documents of different ty...
In this paper we present a new approach to document representation in order to be u...
This paper focuses on the problem of choosing a representation of documents that can be suitable to ...
The problem of document classification based on their semantic content (text categorization) arises ...
In this paper we perform a comparative analysis of three models for a feature representation of text...