Many information retrieval systems provide access to abstracts. For example, Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of the literature on physics, computer science, electrical engineering, etc. In this article, this database is studied using a trace-driven simulation. We focus on a physical index design that accommodates truncations, inverted index caching, and database scaling in a distributed shared-nothing system. All three issues are shown to have a strong effect on response time and throughput. Database scaling is explored in two ways. One way assumes an "optimal" configuration for a single host and then linearly scales the database by duplicating the host architecture as ...
Information explosion across the Internet and elsewhere offers access to an increasing number of doc...
Simulation and analysis have shown that selective search can reduce the cost of large-scale distribu...
Information retrieval systems often have to deal with very large amounts of data. They must be able ...
A common class of existing information retrieval systems provides access to abstracts. For example St...
The major emphasis of this paper is on analytical techniques for predicting the performance of vario...
The amount of information available over the Internet is increasing daily as well as the importance ...
Large document collections are increasingly available over the network. In order for users to access...
Abstract. Complex and data-intensive database queries mandate parallel processing strategies to achi...
In a shared-nothing, distributed text retrieval system, queries are processed over an inverted index...
Large-scale web and text retrieval systems deal with amounts of data that greatly exceed the capacit...
As information explodes across the Internet and intranets, information retrieval (IR) systems must c...
The proliferation of the world's "information highways" has renewed interest in efficie...
Simulation and analysis have shown that selective search can reduce the cost of large-scale distribu...
We identify crucial design issues in building a distributed inverted index for a large collection of...
Information explosion across the Internet and elsewhere offers access to an increasing number of do...