Our group designed and implemented a web crawler this semester. We investigated techniques for fetching a webpage, gathering the URL links it contains, and recursively repeating this procedure for every URL discovered, storing the results in a graph data structure whose nodes represent web pages and whose directed edges represent the links between them. This is by no means an original endeavour; many companies operate web crawlers, including Google, Microsoft, and Baidu. The reasons for crawling the web range from commercial to private: one might want to index all content on the Internet, or build a graph of the interconnections between webpages, in order to search for content on the web...
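The core of this design is a breadth-first traversal of the link graph. The sketch below is a minimal, illustrative version of that loop rather than our actual implementation; it uses only the Python standard library, and names such as crawl, LinkParser, and max_pages are placeholders.

# A minimal sketch of the crawl loop described above: fetch a page, extract
# its links, and repeat breadth-first, recording pages as nodes and links as
# directed edges. Illustrative only; names are placeholders, not our code.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collects the href targets of <a> tags on one page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed, max_pages=100):
    """Breadth-first crawl from seed; returns {page: [pages it links to]}."""
    graph = {}                      # node -> list of out-edges
    frontier = deque([seed])
    seen = {seed}
    while frontier and len(graph) < max_pages:
        url = frontier.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except Exception:
            continue                # unreachable page: skip, keep crawling
        parser = LinkParser()
        parser.feed(html)
        # Resolve relative links and drop #fragments before recording edges.
        targets = [urldefrag(urljoin(url, href))[0] for href in parser.links]
        graph[url] = targets
        for target in targets:
            if target.startswith("http") and target not in seen:
                seen.add(target)
                frontier.append(target)
    return graph

Calling crawl("https://example.com") returns an adjacency dictionary mapping each fetched page to the pages it links to, which is exactly the node-and-directed-edge structure described above.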
A web crawler is also called a spider. For the purpose of web indexing, it automatically searches on ...
To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions...
The need to quickly locate, gather, and store the vast amount of material in the Web necessitates pa...
One of the main objectives in designing a Parallel Incremental Web Crawler is to provide a solution ...
Abstract: As the size of the Web grows, it becomes increasingly important to parallelize a crawling ...
The WWW is a collection of hyperlinked documents available in HTML format [10]. This collection is very hug...
A web crawler could either be a standalone program or a distributed system that downloads webpages f...
Abstract: In this paper, we put forward a technique for parallel crawling of the web. The World Wide...
Abstract – As the number of Internet users and the number of accessible Web pages grow, it is becom...
Abstract—With the ever proliferating size and scale of the WWW [1], efficient ways of exploring cont...
Web crawlers form the backbone of applications that facilitate Web information retrieval. Generic c...
The number of web pages is increasing into millions and trillions around the world. To make searching...
Abstract—A web crawler is a software program that browses the WWW in an automated or orderly fashion, and ...
Web crawlers visit internet applications, collect data, and learn about new web pages from visited p...
A Web Crawler, also known as a “Web Robot”, “Web Spider”, or merely a “Bot”, is software for downloadi...
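Several of the abstracts above stress the importance of parallelizing the crawl. The sketch below shows one minimal way to do so, and is not the scheme of any particular cited paper: worker threads fetch pages concurrently while a single loop owns the frontier and the graph, so the shared state needs no locking. The fetch_links argument is a hypothetical helper that downloads one page and returns its outgoing links (for instance, the LinkParser-based extraction sketched earlier).

# One minimal way to parallelize the fetch step (an assumption-laden sketch,
# not the method of any cited paper): worker threads download concurrently,
# while a single coordinating loop updates the frontier and the graph.
from concurrent.futures import ThreadPoolExecutor, as_completed

def parallel_crawl(seeds, fetch_links, max_pages=100, workers=8):
    """Crawl with up to `workers` concurrent fetches; returns the link graph."""
    graph, seen = {}, set(seeds)
    frontier = list(seeds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(graph) < max_pages:
            # Hand one batch of URLs to the pool, keep the rest queued.
            batch, frontier = frontier[:workers], frontier[workers:]
            futures = {pool.submit(fetch_links, url): url for url in batch}
            for future in as_completed(futures):
                url = futures[future]
                try:
                    targets = future.result()
                except Exception:
                    continue            # failed fetch: drop the page
                graph[url] = targets
                for target in targets:
                    if target not in seen:
                        seen.add(target)
                        frontier.append(target)
    return graph

In a full-scale system the frontier itself would be partitioned across processes or machines, and coordinating that partition so that the same pages are not fetched twice is one of the problems such parallel designs must address.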