Because the frequency distribution of term occurrence is highly skewed (e.g., Zipf’s law), it is unlikely that any single technique for indexing text can do well in all situations. In this paper we propose a hybrid approach to indexing text, and show how it can outperform the traditional inverted B-tree index in storage overhead, in retrieval time, and, for dynamic databases, in insertion time, for both single-term and multiple-term queries. We demonstrate the benefits of our technique on a database of stories from the Associated Press news wire, and we provide formulae and guidelines on how to make optimal choices of the design parameters in real applications.
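To make the motivation concrete, the following minimal sketch shows why a skewed term distribution can favor different representations for different terms. It is an illustration under stated assumptions, not the method proposed in the paper: the idealized Zipf model (the rank-r term occurs in roughly N/r documents), the two candidate representations (a list of fixed-width document ids versus a one-bit-per-document bitmap), and all function names are hypothetical choices made for this example.

```python
import math


def zipf_doc_frequencies(num_terms, num_docs):
    """Assumed idealized Zipf model: the rank-r term occurs in about num_docs / r documents."""
    return [max(1, num_docs // rank) for rank in range(1, num_terms + 1)]


def postings_list_bits(doc_freq, num_docs):
    """Cost of a postings list: one fixed-width document id per occurrence."""
    id_bits = max(1, math.ceil(math.log2(num_docs)))
    return doc_freq * id_bits


def bitmap_bits(num_docs):
    """Cost of a bitmap: one bit per document, independent of term frequency."""
    return num_docs


def choose_representation(doc_freq, num_docs):
    """Pick whichever (hypothetical) representation is smaller for this term."""
    if bitmap_bits(num_docs) < postings_list_bits(doc_freq, num_docs):
        return "bitmap"
    return "postings"


NUM_DOCS = 1024
freqs = zipf_doc_frequencies(num_terms=1000, num_docs=NUM_DOCS)
choices = [choose_representation(f, NUM_DOCS) for f in freqs]

print("terms better served by a bitmap:", choices.count("bitmap"))
print("terms better served by a postings list:", choices.count("postings"))
```

Under these assumptions only a handful of very frequent terms favor the bitmap, while the long tail of rare terms favors the postings list, which is the kind of split that motivates treating frequent and rare terms differently.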
In this chapter we describe a set of index structures that are suitable for supporting search querie...
With the proliferation of the world's "information highways" a renewed interest in efficient docum...
Intersecting inverted indexes is a fundamental operation for many applications in information retrie...
The technology underlying text search engines has advanced dramatically in the past decade. The deve...
Efficient construction of inverted indexes is essential to provision of search over large collection...
Inverted index structures are a core element of current text retrieval systems. They can be construc...
Retrieval effectiveness depends on how terms are extracted and indexed. For Chinese text (and others...
In-place and merge-based index maintenance are the two main competing strategies for on-line index ...
For free-text search over rapidly evolving corpora, dynamic update of inverted indices is a basic re...
Query processing with precomputed term pair lists can improve efficiency for some queries, but suff...
Full-text database systems require an index to allow fast access to documents based on th...
An inverted index stores, for each term that appears in a collection of documents, a list of docum...
This thesis describes the development and setup of hybrid index structures. They are access methods ...
The inverted index supports efficient full-text searches on natural language text collections. It re...