Are two adjectives worth the same as a single noun when documents are ordered by decreasing topicality? We propose an easy-to-interpret, single-number measure, Relative Feature Utility (RFU), of the relative worth of using specific linguistic or non-linguistic features, or sets of features, in computational systems that order or filter media, such as information retrieval and classification systems. This measure supports easily interpreted claims about the relative utility of features such as parts of speech, term suffixes, phrases vs. single terms, annotations, hyperlinks, citations, index terms, and metadata when ordering natural language text or other media. Data is provided for the RFU for stemming characteristics, ...
The selection and identification of terms is an important part of many natural language applications...
One of the most relevant problems with Information Retrieval (IR) software is the correct processing...
A number of content management tasks, including term categorization, term clustering, and automated ...
The use of natural language information can improve decision-making. Darwinian considerations sugges...
This paper follows a formal approach to information retrieval based on statistical language models. ...
... each feature's positive or negative contribution to ... We present a comparative analysis of the pe...
Although relevance is known to be a multidimensional concept, information retrieval measures mainly ...
With the emergence of vast resources of information, it is necessary to develop methods that retriev...
Evaluation of natural language processing tools and systems must focus on two complementary aspects:...
Supervised text categorization is a machine learning task where a predefined category label is autom...
This paper discusses research on distinguishing word meanings in the context of information retrieva...
In this paper, a natural language approach to Information Retrieval (IR) and Information Filtering (...
In this paper, we describe the first in a series of experiments for determining the usefulness of sta...
Automatic language processing tools typically assign to terms so-called 'weights' correspo...