Scalable similarity search on images, documents, and user activities benefits generic search, data visualization, and recommendation systems. This thesis concerns the design of algorithms and machine learning tools for faster and more accurate similarity search. The proposed techniques advocate the use of discrete codes for representing the similarity structure of data in a compact way. In particular, we will discuss how one can learn to map high-dimensional data onto binary codes with a metric learning approach. Then, we will describe a simple algorithm for fast exact nearest neighbor search in Hamming distance, which exhibits sub-linear query time performance. Going beyond binary codes, we will highlight a compositional generalization of ...
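The abstract does not spell out how the exact Hamming-distance search works, so the following is only a minimal sketch of one way such a search can avoid a full linear scan, not the thesis's own algorithm. It assumes binary codes stored as Python integers and relies on the pigeonhole principle: if two codes split into m substrings differ in fewer than m bits, at least one substring must match exactly. All names here (`build_index`, `search`, `split_code`) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")


def split_code(code, bits, m):
    """Split a `bits`-bit code into m equal-width substrings (low bits first)."""
    width = bits // m
    mask = (1 << width) - 1
    return [(code >> (i * width)) & mask for i in range(m)]


def build_index(codes, bits, m):
    """Build one hash table per substring position, mapping substring -> code ids."""
    if bits % m != 0:
        raise ValueError("bits must be divisible by m in this sketch")
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for table, chunk in zip(tables, split_code(code, bits, m)):
            table[chunk].append(idx)
    return tables


def search(query, codes, tables, bits, m, r):
    """Return ids of all codes within Hamming distance r of the query.

    Exact only for r < m: with fewer than m differing bits, at least one
    of the m substrings is identical, so every true neighbor shows up in
    at least one bucket probed below.
    """
    if r >= m:
        raise ValueError("this sketch only handles r < m")
    candidates = set()
    for table, chunk in zip(tables, split_code(query, bits, m)):
        candidates.update(table.get(chunk, ()))
    # Verify candidates with the full distance to discard false positives.
    return sorted(i for i in candidates if hamming(codes[i], query) <= r)


codes = [0b00000000, 0b00000001, 0b11111111, 0b00010001]
tables = build_index(codes, bits=8, m=4)
neighbors = search(0b00000000, codes, tables, bits=8, m=4, r=2)
```

Only the codes that share a bucket with the query are compared in full, which is where the savings over a linear scan come from when buckets are small. Handling larger radii would require probing nearby substrings as well, which this sketch deliberately leaves out.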
Similarity search is a key problem in many real world applications including image and text retrieva...
The nearest- or near-neighbor query problems arise in a large variety of database applications, usua...
Metric databases are databases where a metric distance function is defined for pairs of database obj...
An important characteristic of the recent decade is the dramatic growth in the use and generation of...
Similarity search problems in high-dimensional data arise in many areas of computer science such as ...
This thesis studies the scalability of the similarity search problem in large-scale multidimensional...
We consider approaches for exact similarity search in a high dimensional space of correlated feature...
Research Doctorate - Doctor of Philosophy (PhD). This thesis presents techniques for accelerating simi...
As databases increasingly integrate different types of information such as time-series, multimedia a...
Similarity search is the basis for many data analytics techniques, including k-nearest neighbor clas...
Similarity search is a task fundamental to many machine learning and data analytics applications, wh...
Algorithms to rapidly search massive image or video collections are critical for many visi...
Image similarity search is a fundamental problem in computer vision. Efficient similarity search acr...
Similarity search is important for many data-intensive applications to identify a set of similar obj...
First we consider pair-wise distances for literal objects consisting of finite binary files. These f...