Most research on automated text categorization has focused on determining the topic of a given text. While topic is generally the main characteristic of an information need, there are other characteristics that are useful for information retrieval. In this paper we consider the problem of text categorization according to style. For example, we may wish to automatically determine if a given text is taken from a magazine or a newspaper, is an editorial or a news item, is promotional or informative, was written by a native English speaker or not, and so on. Learning to determine the style of a document is a dual to that of determining its topic, in that those document features which capture the style of a document are precisely those which ar...
This thesis aims at examining to what extent a few, algorithmically very easily extractable document...
This paper presents an in-document content classification approach that combines genre analysis and ...
Over the last few years, there has been an increased interest in the combined use of natural language pr...
With the development of online data, text categorization has become one of the key procedures for ta...
Genre characterizes text differently than the usual subject or propositional content that has been t...
In this paper we focus on helping editors in the newspaper industry by making their work easy by p...
In this thesis, we investigate the usefulness of a group of features in genre classification problem...
Modern Information Technologies and Web-based services are faced with the problem of selecting, filt...
The two main factors that characterize a text are its content and its style, and both can be used as...
Master of Science. Department of Computer Science. William Hsu. This work describes a comparative study of...
Abstract: This paper contributes to a facet from the area of Web Information Retrieval that has rece...
Text genre classification is the process of identifying functional characteristics of text documents...
Abstract. This paper contributes to a facet from the area of Web Information Retrieval that has rece...
Text categorization is the task of discovering the category or class text documents belong to, or i...
Nowadays, the number of electronically available information and knowledge from the internet is rapidl...