This paper proposes a new approach to the detection and containment of Web crawler traverses based on clickstream data mining. Timely detection prevents abusive consumption of Web server resources by crawlers and potential violation of site content privacy or copyright. Differentiating clickstream data enables focused usage analysis, valuable for profiling both regular users and crawlers. Our platform, named ClickTips, sustains a site-specific, updatable detection model that tags Web crawler traverses based on incremental Web session inspection, and a decision model that assesses eventual containment. The goal is to deliver a model flexible enough to keep up with the continuous evolution of crawling and capable of detecting crawler presence as so...
Abstract — World Wide Web (WWW) is a big dynamic network and a repository of interconnected document...
Web crawlers are a fundamental component of web application scanners and are used to explore the att...
The tremendous growth of the Web poses many challenges for all-purpose single-process crawlers inclu...
The uncontrolled spread of Web crawlers has led to undesired situations of server overload and contents ...
Web profiles may support the analysis of Web site popularity as well as the detection of unwanted an...
Web sites routinely monitor visitor traffic as a useful measure of their overall success. However, s...
Data available on the Web, such as financial data or public reviews, provides ...
Web robots, crawlers and spiders are software agents that visit Web sites periodically for multiple ...
Web crawlers are software programs that automatically traverse the hyperlink structure of the world-...
Abstract Web crawlers have been misused for several malicious purposes such as downloading server da...
Web crawlers have been developed for several malicious purposes like downloading server data without...
This paper proposes an advanced countermeasure against distributed web-crawlers. We investigated oth...
Web crawlers have a long and interesting history. Early web crawlers collected statistics about the...
The expansion of the World Wide Web has led to a chaotic state where the users of the internet have ...
Internet demands a robust and resilient protected communication and computing environment to enable ...