Abstract. This paper proposes and evaluates the use of linguistic information in the pre-processing phase of text classification. We present several experiments evaluating the selection of terms based on different measures and linguistic knowledge. To build the classifier we used Support Vector Machines (SVM), which are known to produce good results on text classification tasks. Our proposals were applied to two different datasets written in the Portuguese language: articles from a Brazilian newspaper (Folha de São Paulo) and juridical documents from the Portuguese Attorney General’s Office. The results show the relevance of part-of-speech information for the pre-processing phase of text classification, allowing for a strong reduction o...
This paper presents a comparative study of different methods for the identification of multiword exp...
In this article, we present the results as well as the procedures of a wide descriptive, corpus-base...
Portuguese juridical documents from Supreme Courts and the Attorney General’s Office are manually cl...
This paper examines the role of various linguistic structures on text classification applying the st...
Support Vector Machines have been applied to text classification with great success. In this paper, ...
Text classification is an important task in the legal domain. In fact, most of the legal information...
This dissertation proposes a set of procedures for the computational processing of Portuguese. Five ...
This paper describes two automatic systems: a linguistic features extractor and a text readability c...
Support Vector Machines have been used successfully to classify text documents into sets of concepts...
This paper performs a study on the pre-processing phase of the automated text classification problem...
The study reports the results of the exploration of a machine-readable corpus of Brazilian Portugues...
Support Vector Machines (SVM) can classify objects described by an effectively infinite-dimensional ...
As in many other natural language processing (NLP) fields, the use of statistical methods is now par...