The content analysis, or indexing, problem is fundamental in information storage and retrieval. Several automatic procedures are examined for assigning significance values to the terms, or keywords, that identify the documents of a collection. Good and bad index terms are characterized by objective measures, leading to the conclusion that the best index terms are those with medium document frequency and skewed frequency distributions. A discrimination value model is introduced which makes it possible to construct effective indexing vocabularies by using phrase and thesaurus transformations to modify poor discriminators (those whose document frequency is too high or too low) into better discriminators, and hence more useful...
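The abstract does not give a formula for the discrimination value, so the following is only a minimal sketch under stated assumptions: documents are term-weight vectors, the "density" of the collection is taken as the average cosine similarity between each document and the collection centroid, and the discrimination value of a term is the density with that term removed minus the density with it present. The function names (`cosine`, `density`, `discrimination_values`) and the toy data are hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def density(docs):
    """Average similarity of each document to the collection centroid (assumed measure)."""
    n = len(docs)
    centroid = [sum(column) / n for column in zip(*docs)]
    return sum(cosine(doc, centroid) for doc in docs) / n


def discrimination_values(docs):
    """For each term k: density without term k minus density with all terms.

    Under this assumed definition, a positive value means that removing the
    term makes documents look more alike, so the term helps tell them apart.
    A negative value marks a poor discriminator.
    """
    base = density(docs)
    n_terms = len(docs[0])
    values = []
    for k in range(n_terms):
        reduced = [doc[:k] + doc[k + 1:] for doc in docs]
        values.append(density(reduced) - base)
    return values


# Toy collection (hypothetical): term 0 occurs in every document,
# terms 1 and 2 each occur in half of the documents.
docs = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]
dv = discrimination_values(docs)
print(dv)
```

In this toy collection the term that appears in every document receives a negative value, while the two medium-frequency terms receive positive values, which is consistent with the abstract's claim that terms with very high document frequency are poor discriminators.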
Information Retrieval is concerned with locating information (usually text) that is relevant to a us...
Documents indexed with controlled vocabularies enable users of libraries to discover relevant docume...
An attempt is made to characterize the usefulness of terms occurring in stored documents and user qu...
Given a written text in natural language, it is convenient to represent the information content of t...
Index terms are an important component in considering a scientific topic. In a real sense, ...
The common view of the 'aboutness' of documents is that the index entries (or classificat...
Most existing automatic content analysis and indexing techniques are based on word frequency charac...
In this study we propose statistical models to model the indexing of textual documents by human inde...
Traditional index weighting approaches for information retrieval from texts depend on the term frequ...
Indexing in information retrieval (IR) is used to obtain a suitable vocabulary of index terms and op...
A variety of abstract automatic indexing models have been developed in recent times in an effort to...
A great many automatic indexing methods have been implemented and evaluated over the last few years...
We present an evaluation of domain-independent natural language tools for use in the identification o...