While high-dimensional search-by-similarity techniques have reached maturity and overall provide good performance, most of them are unable to cope with very large multimedia collections. The 'big data' challenge, however, has to be addressed, as multimedia collections have been growing explosively and will grow even faster within the next few years. Fortunately, computational processing power has become more available to researchers thanks to easier access to distributed grid infrastructures. In this paper, we show how high-dimensional indexing methods can be used in scientific grid environments and present a scalable workflow for indexing and searching over 30 billion SIFT descriptors using a cluster running Hadoo...
Access methods are a fundamental tool in Information Retrieval. However, most of these methods suff...
In Information Retrieval (IR) the efficient strategy of indexing large datasets and terabyte-scale ...
The notorious "dimensionality curse" is a well-known phenomenon for any multi-dimensional indexes ...
Most researchers working on high-dimensional indexing agree on the following t...
This paper presents an initial study where the creation of a high-dimensional ...
The scale of multimedia collections has grown very fast over the last few years. Facebook stores mor...
Indexing high-dimensional data is useful in many real-world applications. Especially the infor...
Scientific data analysis applications require large scale computing power to effectively service cli...
In recent years Hadoop has been used as a standard backend for big data applications. Its most kno...
Similarity search is critical for many database applications, including the increasingly p...
Scientific datasets are often stored on distributed archival storage systems, because geographically...
While declustering methods for distributed multidimensional indexing of large datasets have been res...
The creation of very large-scale multimedia search engines, with more than one billion images and v...
Efficient response to search queries is crucial for data analysts to obtain timely results from...