Purpose -- This paper investigates the impact of web robots on usage statistics collected by Open Access institutional repositories (IRs) and techniques for mitigating their effects. Design/methodology/approach -- A review of the literature provides a comprehensive list of web robot detection techniques. Reviews of system documentation and open source code, along with personal interviews, provide a comparison of the robot detection techniques used in the major IR platforms. An empirical test based on a simple random sample of downloads with 96.20% certainty is undertaken to measure the accuracy of an IR's web robot detection at a large Irish university. Findings -- While web robot detection is not ignored in IRs, there are ar...
The article deals with a study of web-crawler behaviour on different websites. A classification of w...
Open repositories are open to EVERYONE. Unfortunately, that includes machines, or robots / “bots”, r...
Humans tend to follow least-effort heuristics when seeking scientific literature. Despite...
Most modern Web robots that crawl the Internet to support value-added services and technologies poss...
9th International Conference on Qualitative and Quantitative Methods in Libraries (QQML2017), Limeri...
A significant proportion of Web traffic is now attributed to Web robots, and this proportion is like...
It has been traditionally believed that humans, who exhibit well-studied behaviors and statistical r...
Search engines largely rely on robots (i.e., crawlers or spiders) to collect information from the We...
This is a second signal-detection analysis of the accuracy of a robot in detecting open access (OA) ...
Sophisticated Web robots sport a wide variety of functionality and visiting characteristics, constit...
This paper presents a study on whether the heavy-tailed trends reported in Web traffic are present i...
The accurate detection of Web robot sessions from a web server log is essential to take accurate tra...
Antelman et al. (2005) hand-tested the accuracy of the algorithm that Hajjem et al.'s (2005) softwar...
This paper examines the use of "Robot Exclusion Protocol" to restrict the access of search engine ro...
To identify robots and human users in web archives, we conducted a study using the access logs from ...