In this paper, we introduce a new collection selection strategy for search engines with document-partitioned indexes. Our method selects the document partitions that are most likely to deliver the best results for a given query, which reduces the number of queries submitted to each partition. It employs learning algorithms that rank the partitions so as to maximize the probability of retrieving documents with high gain. The method operates by building a vector representation of each partition in the term space spanned by the queries. The proposed method is able to generalize to new queries and elaborate document lists with high precision for queries not considered dur...
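The idea of representing each partition as a vector over the query term space can be illustrated with a minimal sketch. It is not the learning algorithm of the paper: it assumes a hypothetical training log that pairs each past query with the partitions that held its relevant documents, and it swaps in plain cosine similarity for the learned ranking function. All function names and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def build_partition_vectors(training_queries, relevant_partitions):
    """Build one term-count vector per partition from past queries.

    training_queries: list of query strings (hypothetical training log).
    relevant_partitions: for each query, the partition ids that held
    relevant documents for it (assumed to be known from training data).
    """
    vectors = {}
    for query, partitions in zip(training_queries, relevant_partitions):
        terms = query.lower().split()
        for partition_id in partitions:
            vectors.setdefault(partition_id, Counter()).update(terms)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_partitions(query, vectors, k=1):
    """Return the k partition ids ranked highest for the query.

    Ties are broken by partition id so the ranking is deterministic.
    """
    q = Counter(query.lower().split())
    ranked = sorted(vectors, key=lambda p: (-cosine(q, vectors[p]), p))
    return ranked[:k]


# Toy training log: three past queries and the partitions that answered them.
queries = ["cheap flights paris", "python list sort", "paris hotels"]
partitions = [[0], [1], [0, 2]]
vectors = build_partition_vectors(queries, partitions)
```

With this toy log, calling `select_partitions("paris hotels", vectors, k=2)` forwards the query to only two of the three partitions, which is the load reduction the abstract describes. A learned ranker would replace the `cosine` score.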
A publicly available dataset for federated search reflecting a real web environment has long been bs...
Careful architectural decisions are required in order to create a highly available and scalable sear...
Text search engines return a set of k documents ranked by similarity to a query. Typically, document...
Large-scale Parallel Web Search Engines (WSEs) need to adopt a strategy for partitioning the invert...
We present a novel strategy to partition a document collection onto several servers and t...
In this thesis, we present a distributed architecture for a Web search engine, based on the concept ...
This article introduces an architecture for a document-partitioned search engine, based on a novel a...
Large web search engines process billions of queries each day over tens of billions of documents wit...
Web search engines have to deal with a rapidly increasing amount of information, high query loads an...
To address the rapid growth of the Internet, modern Web search engines have to adopt distri...
As the number of electronic data collections available on the internet increases, so does the diffic...
To address the rapid growth of the Internet, modern Web search engines have to adopt distributed orga...
The Web is comprised of a vast quantity of text. Modern search engines struggle to index it independ...
Traditional document classification frameworks, which apply the learned classifier to each document ...