Duplicate image discovery, that is, discovering clusters of duplicate images, is a challenging problem at the scale of billions of Internet images because there is no good distance metric that both covers the large variation within a duplicate image cluster and eliminates false alarms. After carefully investigating existing local and global features that have been widely used for large-scale image search and indexing, we propose a two-step approach that combines both kinds of features: global descriptors are used to discover seed clusters with high precision, whereas local descriptors are used to grow the seeds to achieve good recall. Using efficient hashing techniques for both features and the MapReduce framework, our system is able to discover about...
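The two-step idea described above (seed clusters from global descriptors, then growth using local descriptors) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the quantization-based `global_hash`, the Jaccard test on sets of local feature ids, the thresholds, and the toy data are hypothetical stand-ins, not the hashing techniques or features of the proposed system, and the MapReduce distribution is omitted entirely.

```python
from collections import defaultdict


def global_hash(descriptor, step=0.25):
    """Quantize a global descriptor into a coarse bucket key (assumed scheme)."""
    return tuple(int(v // step) for v in descriptor)


def jaccard(a, b):
    """Jaccard similarity between two sets of local feature ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_seeds(images):
    """Step 1: images whose global descriptors share a bucket form a seed cluster."""
    buckets = defaultdict(list)
    for name, (descriptor, _local) in images.items():
        buckets[global_hash(descriptor)].append(name)
    # Buckets holding a single image are not treated as clusters.
    return [set(members) for members in buckets.values() if len(members) > 1]


def grow_seeds(images, seeds, threshold=0.5):
    """Step 2: attach unclustered images that match a seed member on local features."""
    clustered = set().union(*seeds) if seeds else set()
    clusters = [set(seed) for seed in seeds]
    for name, (_descriptor, local) in images.items():
        if name in clustered:
            continue
        for seed, cluster in zip(seeds, clusters):
            # Compare against the original seed members only, so growth
            # does not chain through images that were added later.
            if any(jaccard(local, images[m][1]) >= threshold for m in seed):
                cluster.add(name)
                break
    return clusters


# Hypothetical toy data: name -> (global descriptor, set of local feature ids).
images = {
    "a.jpg": ((0.10, 0.80), {1, 2, 3, 4}),
    "b.jpg": ((0.12, 0.82), {1, 2, 3, 5}),
    "c.jpg": ((0.60, 0.30), {1, 2, 3, 4, 6}),  # edited copy: global differs, local overlaps
    "d.jpg": ((0.90, 0.10), {7, 8, 9}),        # unrelated image
}

seeds = find_seeds(images)
clusters = grow_seeds(images, seeds)
print(seeds)
print(clusters)
```

In this toy run, `a.jpg` and `b.jpg` land in the same global bucket and form the seed, `c.jpg` is pulled in during growth because it shares most local features with `a.jpg`, and `d.jpg` stays unclustered. Matching only against the original seed members is a design choice of this sketch, made to limit drift during growth.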
We propose a randomized data mining method that finds clusters of spatially overlapping ima...
The vast numbers of images on the Web include many duplicates, and an even larger number of near-dup...
The explosive growth of multimedia data poses serious challenges to data storage, management and sea...
This paper addresses the problem of detecting groups of duplicates in large-sc...
Near-duplicate images introduce problems of redundancy and copyright infringement in large image col...
Social media intelligence is interested in detecting the massive propagation o...
Large scale duplicate detection, clustering and mining of documents or images ...
Conference of 13th International Workshop on Content-Based Multimedia Indexing, CBMI 2015 ; Conferen...