Abstract

Similarity search has proven suitable for searching large collections of unstructured data objects. A number of practical index data structures have been proposed for this purpose, but all of them were devised to process single queries sequentially. However, in large-scale systems such as Web search engines that index multimedia content, it is critical to deal efficiently with streams of queries rather than with single queries. In this paper we show how to achieve efficient and scalable performance in this context. To this end, we transform a sequential index based on clustering into a distributed one, and we devise algorithms and optimizations specially tailored to support high-performance parallel query processing.
Indexing is crucial for many data mining tasks that rely on efficient and effe...
Due to the increasing complexity of current digital data, the similarity search has become a fundame...
Due to the increasing complexity of current digital data, similarity search has become a fundamental...
This article compares several strategies for searching in Web engines and we present the bucket alg...
Information retrieval systems often have to deal with very large amounts of data. They must be able ...
Access methods are a fundamental tool in Information Retrieval. However, most of these methods suff...
The creation of very large-scale multimedia search engines, with more than one billion images and v...
The Web has become a ubiquitous resource for distributed computing, making it relevant to investigat...
I would like to thank my supervisor Pavel Zezula for guidance, insight and patience during this rese...