Consider the task of exploring the Web in order to find pages of a particular kind or on a particular topic. This task arises in the construction of domain-specific search engines. A selective, directed web spider can be much more efficient than a spider that gathers new pages indiscriminately. This paper argues that the creation of efficient web spiders is best framed and solved by reinforcement learning, a branch of machine learning that concerns itself with optimal sequential decision making. One strength of reinforcement learning is that it provides a formalism for measuring the utility of actions that give no immediate benefit, but do give benefit in the future. Topic-specific spidering fits into the reinforcement learning framework be...
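The idea of crediting an action for benefit that arrives only later can be illustrated with a discounted sum of future rewards, the standard reinforcement learning notion of return. The sketch below is not taken from the paper: the function name, the reward encoding (1 for an on-topic page, 0 otherwise), the example paths, and the discount factor are all assumptions chosen for illustration.

```python
def discounted_return(rewards, gamma=0.5):
    """Sum rewards along a path, discounting each by gamma per step of delay."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total


# Hypothetical crawl paths: 1 marks a fetched on-topic page, 0 an off-topic one.
immediate_hit = [1, 0, 0]    # on-topic page first, nothing afterwards
delayed_hub = [0, 1, 1, 1]   # off-topic hub page that leads to three on-topic pages

if __name__ == "__main__":
    # With little discounting, the link to the hub is worth more than the
    # link to the single on-topic page, although it pays nothing immediately.
    print(discounted_return(immediate_hit, gamma=0.9))
    print(discounted_return(delayed_hub, gamma=0.9))
```

Under these assumed values, a spider that ranks links only by immediate reward would always prefer the first path, while one that ranks by discounted return would follow the hub link whenever the discount factor is close enough to 1.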
In recent years, the World Wide Web has shown enormous growth in size. Vast repositories of informat...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
Artificial Intelligence Lab, Department of MIS, University of Arizona. Spiders are the software agents...
Focused crawling aims at collecting as many Web pages relevant to a target top...
A focused crawler aims at discovering as many web pages relevant to a target topic as possible, whil...
Domain-specific search engines are growing in popularity because they offer increased accuracy and e...
We propose a novel deep web crawling framework based on reinforcement learning. The crawler is regar...
Testing web applications through the GUI can be complex and time-consuming, as it involves checking ...
Abstract. In this paper we compare our selection-based learning algorithm with the reinforcement le...
Domain-specific search engines are becoming increasingly popular because they offer increased accura...
Abstract. Domain-specific internet portals are growing in popularity because they gather content fro...
Machine learning plays a pivotal role in artificial intelligence, allowing machines to mimic human l...
Given a database with missing or uncertain information, our goal is to extract specific information ...
The Web's dynamic, unstructured nature makes locating resources difficult. Vertical search engines s...