As information explodes across the Internet and intranets, information retrieval (IR) systems must cope with the challenge of scale. Providing scalable performance for rapidly growing data and workloads is critical to the design of next-generation information retrieval systems. This dissertation studies scalable distributed IR architectures that not only respond quickly but also maintain acceptable retrieval accuracy. Our distributed architectures exploit parallelism in information retrieval on a cluster of parallel IR servers using symmetric multiprocessors, and use partial collection replication and selection, as well as collection selection, to restrict the search to a small percentage of the data while maintaining retrieval acc...
The amount of information available over the Internet is increasing daily as well as the importance ...
Our current concern is a scalable infrastructure for information retrieval (IR) with up-to-date retr...
Server selection is typically defined as maximizing network performance under the assumption that ea...
The explosion of content in distributed information retrieval (IR) systems requires new mechanisms i...
Providing timely access to text collections both locally and across the Internet is instrumental in ...
Information explosion across the Internet and elsewhere offers access to an increasing number of doc...
Information explosion across the Internet and elsewhere offers access to an increasing number of doc...
The explosion of content in distributed information retrieval (IR) systems requires new mechanisms t...
Information retrieval systems often have to deal with very large amounts of data. They must be able ...
The explosion of content in distributed information retrieval (IR) systems requires new me...
Information explosion across the Internet and elsewhere offers access to an increasing number of do...
Information explosion across the Internet and elsewhere offers access to an increasing number of doc...
Large document collections are increasingly available over the network. In order for users to access...
Large-scale web and text retrieval systems deal with amounts of data that greatly exceed the capacit...
Large document collections are increasingly available over the network. In order for users to access...