Abstract. There is a great amount of information on the web that cannot be accessed by conventional crawler engines. This portion of the web is usually known as the Hidden Web. Dealing with this problem requires solving two tasks: crawling the client-side hidden web and crawling the server-side hidden web. In this paper we present an architecture and a set of related techniques for accessing the information placed in web pages with support for client-side dynamism, dealing with aspects such as JavaScript technology, non-standard session maintenance mechanisms, client redirections, pop-up menus, etc. Our approach leverages current browser APIs and implements novel crawling models and algorithms.
Using JavaScript and dynamic DOM manipulation on the client-side of web applications is becoming a w...
AJAX is a very promising approach for improving rich interactivity and responsiveness of web applica...
Abstract. Client-side JavaScript is increasingly used for enhancing web application functionality, i...
Web Crawler forms the back-bone of applications that facilitate Web information retrieval. Generic c...
In this paper, Web Crawling systems are investigated. Such systems are mostly used in Web archiving,...
Abstract: In this paper, we put forward a technique for parallel crawling of the web. The World Wide...
Web application scanners are popular tools to perform black box testing and are widely used to disco...
Web applications have come a long way both in terms of adoption to provide information and services ...
During the past decade, web applications have evolved substantially. Taking advantage of new technol...
The traditional crawlers used by search engines to build their collection of Web pages frequently ga...
The number of applications that need to crawl the Web to gather data is growing at an ever increasin...
JavaScript Client-side hidden web pages (CSHW) contain dynamic material created as a result of speci...
Web crawlers visit internet applications, collect data, and learn about new web pages from visited p...