The nearest neighbor graph is an important structure in many data mining methods for clustering, advertising, recommender systems, and outlier detection. Constructing the graph requires computing up to n^2 similarities for a set of n objects. This high complexity has led researchers to seek approximate methods, which find many but not all of the nearest neighbors. In contrast, we leverage shared memory parallelism and recent advances in similarity joins to solve the problem exactly. Our method considers all pairs of potential neighbors but quickly filters pairs that could not be a part of the nearest neighbor graph, based on similarity upper bound estimates. The filtering is data dependent and not easily predicted, which poses load balance ...
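The filtering idea in the abstract can be illustrated with a minimal, sequential Python sketch. It is not the authors' algorithm: the cosine similarity measure, the Cauchy-Schwarz suffix bound, and the names `knn_graph_exact` and `split` are all assumptions chosen for illustration, and no parallelism or load balancing is attempted. For each pair, a cheap upper bound on the similarity is computed first; the pair is skipped only when that bound cannot beat the current k-th best similarity of either endpoint, so the result stays exact.

```python
import heapq
import numpy as np


def knn_graph_exact(X, k, split=None):
    """Exact k-NN graph under cosine similarity with upper-bound pruning.

    Hypothetical illustration only. Assumes X has no all-zero rows.
    Returns (neighbors, pruned): neighbors[i] is a list of (similarity, j)
    sorted by decreasing similarity; pruned counts the skipped pairs.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-length rows
    split = d // 2 if split is None else split
    # Norm of each row's trailing coordinates, used in the upper bound.
    suffix_norm = np.linalg.norm(X[:, split:], axis=1)
    heaps = [[] for _ in range(n)]  # per-object min-heaps of (similarity, j)
    pruned = 0

    def threshold(i):
        # Smallest similarity in a full neighbor list; -inf while not full.
        return heaps[i][0][0] if len(heaps[i]) == k else -np.inf

    def offer(i, sim, j):
        if len(heaps[i]) < k:
            heapq.heappush(heaps[i], (sim, j))
        elif sim > heaps[i][0][0]:
            heapq.heapreplace(heaps[i], (sim, j))

    for i in range(n):
        for j in range(i + 1, n):
            # Exact dot product over the leading coordinates.
            partial = float(X[i, :split] @ X[j, :split])
            # Cauchy-Schwarz bounds the remaining part of the dot product.
            bound = partial + suffix_norm[i] * suffix_norm[j]
            if bound <= threshold(i) and bound <= threshold(j):
                pruned += 1  # cannot enter either neighbor list
                continue
            sim = partial + float(X[i, split:] @ X[j, split:])
            offer(i, sim, j)
            offer(j, sim, i)

    neighbors = [sorted(h, reverse=True) for h in heaps]
    return neighbors, pruned
```

How many pairs are pruned depends on the data and on the order in which pairs are visited, which is consistent with the abstract's remark that the filtering is data dependent and hard to predict.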
In this paper, we propose an efficient KNN service, called KPS (KNN-Peer-Sampl...
We describe a recursive algorithm to quickly compute the N nearest neighbors according to a similari...
This paper presents a novel approach to perform fast approximate nearest neighbors search in high di...
The nearest neighbor graph is an important structure in many data mining methods for clustering, adv...
University of Minnesota Ph.D. dissertation. June 2016. Major: Computer Science. Advisor: George Kary...
The use of the join operator in metric spaces leads to what is known as a similarity join, where ob...
Similarity search problems in high-dimensional data arise in many areas of computer science such as ...
Thesis (Ph.D.)--University of Washington, 2018. We present several foundational results on computation...
Construction of a nearest neighbor graph is often a necessary step in many machine learning appli...
For many computer vision and machine learning problems, large training sets are key for good perform...
A nonparametric clustering technique incorporating the concept of similarity based on the s...
Leading machine learning techniques rely on inputs in the form of pairwise similarities between obje...
Link prediction, personalized graph search, fraud detection, and many such graph mining problems rev...
K-Nearest-Neighbor (KNN) graphs have emerged as a fundamental building block of many on-line services...
The mining of time series data plays an important role in modern information r...