Documents are traditionally categorized by topic. This paper presents a complementary analysis of research and experiments on genre, showing that encouraging results can be obtained using features of genre structure (form). We conducted an experiment to assess how effectively Extensible Markup Language (XML) tag information and part-of-speech (POS) features classify genres. The hypothesis tested was that if a focus on genre can yield high precision on ordinary text documents, then good results can also be achieved by using XML tag information in addition to POS information. The experiment was carried out on a subset of the Initiative for the Evaluation of XML (INEX) 1.4 collection. The features were ...
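As a rough illustration of what XML tag information could look like as a classifier input, the sketch below turns a document into relative tag frequencies. It is an assumption made for illustration only: the abstract is cut off before it describes the actual feature set, so the function name `tag_frequency_features`, the sample document, and the choice of relative frequency are all hypothetical, and part-of-speech features are not shown.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_frequency_features(xml_text):
    """Map each element tag to its share of all elements in the document.

    Hypothetical feature extractor: the source does not say how its
    XML tag features were computed.
    """
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and every descendant element.
    counts = Counter(elem.tag for elem in root.iter())
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


# Made-up sample document, not taken from the INEX collection.
sample = "<article><sec><p>One</p><p>Two</p></sec></article>"
features = tag_frequency_features(sample)
```

A dictionary of this kind can be passed to any standard vectorizer and classifier. Tag frequencies describe the form of a document rather than its topic, which is the distinction the abstract draws.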
This thesis treats the sociotechnical notion of genre as a conflation of a communicative situation a...
Abstract. This paper contributes to a facet from the area of Web Information Retrieval that has rece...
(postprint); This version corrects a couple of errors in authors' names in the bibliography./www.juc...
Extensible Markup Language (XML) is a simple and flexible text format derived from Standard Generali...
Genre characterizes text differently than the usual subject or propositional content that has been t...
Genre provides a characterization of a document with respect to its form or functional trait. Genre ...
This paper examines automated genre classification of text documents and its role in enabling the ef...
Retrieving relevant documents over the Web is an overwhelming task when search engines return thous...
This thesis aims at examining to what extent a few, algorithmically very easily extractable document...
This paper offers a proposal for some preliminary research on the retrieval of structured text, such...
Abstract. Genre provides a characterization of a document with respect to its form or functional tra...
We discuss the issues of resolving the information-retrieval problem in large digital collections th...
This paper reports on our approach to the analysis of genre recognition using eyetracking. We focuse...
We report on our ongoing study of using the genre of Web pages to facilitate information exploration...