Text categorization using compression models

Frank, Eibe
Chui, Chang
Witten, Ian H.

Publication date

January 2000

Publisher

University of Waikato, Department of Computer Science

Journal

ISSN 1170-487X

Abstract

Text categorization, or the assignment of natural language texts to predefined categories based on their content, is of growing importance as the volume of information available on the internet continues to overwhelm us. The use of predefined categories implies a “supervised learning” approach to categorization, where already-classified articles which effectively define the categories are used as “training data” to build a model that can be used for classifying new articles that comprise the “test data”. This contrasts with “unsupervised” learning, where there is no training data and clusters of like documents are sought amongst the test articles. With supervised learning, meaningful labels (such as keyphrases) are attached to the training ...
