Adapting the Naive Bayes classifier to rank procedural texts

Yin, Ling
Power, Richard

Publication date

March 2006

DOI

10.1007/11735106_17

Publisher

Springer Science and Business Media LLC

Language

English

Abstract

This paper presents a machine-learning approach for ranking web documents according to the proportion of procedural text they contain. By 'procedural text' we refer to ordered lists of steps, which are very common in some instructional genres such as online manuals. Our initial training corpus is built up by applying some simple heuristics to select documents from a large collection and contains only a few documents with a large proportion of procedural texts. We adapt the Naive Bayes classifier to better fit this less than ideal training corpus. This adapted model is compared with several other classifiers in ranking procedural texts using different sets of features and is shown to perform well when only highly distinctive features are us...
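
To make the idea of ranking with Naive Bayes concrete, the following is a minimal sketch of the standard multinomial Naive Bayes classifier used as a ranker. It is not the adapted model the abstract refers to, whose details are not given here. The function names, the class labels "proc" and "other", the use of plain tokens as features, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_nb(docs, labels, alpha=1.0):
    """Fit a standard multinomial Naive Bayes model with Laplace smoothing.

    docs is a list of token lists; labels is a parallel list of class names.
    """
    vocab = {w for d in docs for w in d}
    classes = sorted(set(labels))
    # Log prior of each class, estimated from label frequencies.
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    loglik = {}
    for c in classes:
        counts = Counter(w for d, y in zip(docs, labels) if y == c for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        loglik[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return prior, loglik


def procedural_score(doc, model, pos="proc", neg="other"):
    """Log-odds that doc is procedural; out-of-vocabulary tokens are ignored."""
    prior, loglik = model
    score = prior[pos] - prior[neg]
    for w in doc:
        if w in loglik[pos]:
            score += loglik[pos][w] - loglik[neg][w]
    return score


def rank(docs, model):
    """Return document indices ordered from most to least procedural."""
    return sorted(
        range(len(docs)),
        key=lambda i: procedural_score(docs[i], model),
        reverse=True,
    )


# Hypothetical toy corpus: two procedural and two non-procedural documents.
train_docs = [
    ["first", "click", "then", "press"],
    ["click", "next", "then", "select"],
    ["the", "company", "was", "founded"],
    ["history", "of", "the", "company"],
]
train_labels = ["proc", "proc", "other", "other"]
model = train_nb(train_docs, train_labels)

candidates = [
    ["the", "company", "history"],
    ["first", "click", "then", "select"],
]
ranking = rank(candidates, model)
```

Ranking by log-odds rather than by a hard class label keeps the ordering informative even when the classifier would assign most documents to the same class.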
