This work achieves sublinear-time Nearest Neighbor Search (NNS) in massive-scale document collections. The primary contribution is a two-stage unsupervised hashing framework that integrates two state-of-the-art hashing algorithms, Locality Sensitive Hashing (LSH) and Iterative Quantization (ITQ). LSH handles neighbor candidate pruning, while ITQ provides an efficient and effective reranking of the neighbor pool captured by LSH. Furthermore, the proposed hashing framework capitalizes on both term and topic similarity among documents, leading to precise document retrieval. The experimental results show that our hashing-based document retrieval approach closely approximates the conventional Inform...
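The two-stage idea described above (LSH for candidate pruning, ITQ for reranking) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: documents are assumed to be dense feature vectors already, LSH is taken to be the sign-random-projection variant with a single hash table, and all function names, bit counts, and parameters are hypothetical. The term and topic features mentioned in the abstract are not modeled.

```python
from collections import defaultdict

import numpy as np


def lsh_signatures(X, planes):
    """Sign-random-projection LSH: one bit per random hyperplane."""
    return (X @ planes.T) > 0


def build_buckets(sigs):
    """Group row indices by their full LSH signature (one hash table)."""
    buckets = defaultdict(list)
    for i, sig in enumerate(sigs):
        buckets[sig.tobytes()].append(i)
    return buckets


def fit_itq(X, n_bits, n_iter=20, seed=0):
    """Learn a PCA projection W and an orthogonal rotation R (ITQ-style)."""
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA: keep the top n_bits right singular vectors of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_bits].T
    V = Xc @ W
    # Start from a random orthogonal matrix.
    R, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))
    for _ in range(n_iter):
        # Fix R, update the binary codes B.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B, update R (orthogonal Procrustes problem).
        U, _, Vt2 = np.linalg.svd(V.T @ B)
        R = U @ Vt2
    return mean, W, R


def itq_codes(X, mean, W, R):
    """Binary codes from the learned projection and rotation."""
    return ((X - mean) @ W @ R) >= 0


def search(query, planes, buckets, itq, codes, k=5):
    """Stage 1: look up the query's LSH bucket. Stage 2: rerank by ITQ Hamming distance."""
    key = lsh_signatures(query[None, :], planes)[0].tobytes()
    cand = np.array(buckets.get(key, []), dtype=int)
    if cand.size == 0:
        return cand
    q_code = itq_codes(query[None, :], *itq)[0]
    dist = (codes[cand] != q_code).sum(axis=1)
    order = np.argsort(dist, kind="stable")
    return cand[order[:k]]
```

In this sketch the bucket lookup avoids scanning the whole collection, and the comparatively expensive ranking step touches only the candidates in the matching bucket. A practical system would likely use several hash tables to improve recall.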
Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving nearest neighbo...
We propose a novel hashing-based matching scheme, called Locally Optimized Hashing (LOH), based on ...
Recently, hashing based Approximate Nearest Neighbor (ANN) techniques have been attracting lots of a...
Nearest Neighbor Search for similar document retrieval suffers from an efficiency problem w...
Efficient high-dimensional similarity search structures are essential for building scalable content-...
A promising way to accelerate similarity search is semantic hashing which designs compact binary cod...
Hashing has been widely researched to solve the large-scale approximate nearest neighbor search prob...
In this paper, we have presented a new and faster word retrieval approach, whi...
We show how to learn a deep graphical model of the word-count vectors obtained from a large ...
It is well known that high-dimensional nearest-neighbor retrieval is very expe...
Finding nearest neighbors has become an important operation on databases, with applications to text ...
This thesis studies document signatures, which are small representations of documents and other obje...
Numerous applications in search, databases, machine learning, and computer vision can benefit from...
k-nearest neighbor (k-NN) search aims at finding k points nearest to a query point in a given datase...
The query complexity of locality sensitive hashing (LSH) based similarity search is dominated by th...