Abstract: We consider how to efficiently compute the overlap between all pairs of web documents. This information can be used to improve web crawlers and web archives, and in the presentation of search results, among other applications. Our experiments show how common replication is on the web, and demonstrate that our algorithm outperforms existing alternatives.
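The pairwise-overlap measure this abstract refers to is conventionally formalized as the Jaccard resemblance between the documents' shingle sets. A minimal sketch of that idea follows; the function names, the shingle size k=4, and the toy documents are illustrative assumptions, not details from the paper:

```python
from itertools import combinations

def shingles(text, k=4):
    """Return the set of k-word shingles (contiguous word windows) of a document."""
    words = text.split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def resemblance(a, b):
    """Jaccard resemblance |A ∩ B| / |A ∪ B| between two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Toy corpus (hypothetical): d1 and d2 are near-duplicates, d3 is unrelated.
docs = {
    "d1": "the quick brown fox jumps over the lazy dog",
    "d2": "the quick brown fox leaps over the lazy dog",
    "d3": "completely unrelated text about web crawlers",
}
sets = {name: shingles(text) for name, text in docs.items()}

# Overlap for all document pairs, as in the abstract's all-pairs computation.
scores = {(x, y): resemblance(sets[x], sets[y])
          for x, y in combinations(sorted(docs), 2)}
```

Computing all pairs exactly, as above, is quadratic in the number of documents; the systems surveyed below avoid this by comparing compact fingerprints instead of full shingle sets.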
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and C...
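The shingling paradigm named above reduces each document's shingle set to a fixed-size min-hash sketch, so that the fraction of agreeing sketch positions estimates the Jaccard resemblance. The sketch below is a simplified illustration of that idea, not a faithful reimplementation of Broder's scheme: it substitutes salted SHA-1 hashes for the random permutations of the original paper, and num_hashes=64 is an arbitrary choice.

```python
import hashlib

def shingle_set(text, k=4):
    """Set of k-word shingles of a document."""
    words = text.split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def minhash_sketch(shingles, num_hashes=64):
    """For each of num_hashes salted hash functions, keep the minimum hash value."""
    sketch = []
    for seed in range(num_hashes):
        sketch.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in shingles
        ))
    return sketch

def estimated_resemblance(sk_a, sk_b):
    """Fraction of agreeing positions estimates the Jaccard resemblance."""
    return sum(a == b for a, b in zip(sk_a, sk_b)) / len(sk_a)
```

Because each sketch has constant size, documents can be compared in time independent of their length, which is what makes fingerprint paradigms practical at web scale.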
ABSTRACT: The World Wide Web consists of more than 50 billion pages online. The advent of the World W...
Detecting similar or near-duplicate pairs in a large collection is an important problem with wide-sp...
Many documents are replicated across the World Wide Web. How to efficiently and accurately find the ...
The existence of billions of web pages has severely affected the performance and reliability of web s...
Recent years have witnessed the rapid development of the World Wide Web (WWW). Information is being ac...
The presence of near-replicas of documents is very common on the Web. Documents may be replicated co...
The presence of replicas or near-replicas of documents is very common on the Web. Documents may be r...
A great deal of the Web consists of replicated or near-replicated content. Documents may be served in different...
Duplicate and near-duplicate web pages are a chief concern for web search engines. In reality, th...
With the rapid development of the World Wide Web, there are a huge number of fully or partially d...