The thesis addresses automatic extraction of semantic data from Web pages. Within this broad problem, it focuses on finding the values of data figures within a page presenting a certain entity (e.g. the price of a laptop). The main idea we wanted to evaluate is that a figure can be found using its context in the page: the words that surround it and the values of the attributes of the containing HTML tags, the class attribute in particular. Our research revealed that there are two types of contemporary solutions to this problem: either the author of the Web page must inline semantic information inside the markup of the page, or commercial tools are used that can be trained to parse a particular page format (targeting pages from a single Web domain). We examined th...
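To make the notion of "context" concrete, the following is a minimal sketch, not the thesis's actual implementation. It assumes that a figure is any numeric token, that its context consists of the class names of all enclosing HTML tags plus a small window of neighbouring words, and it uses only Python's standard html.parser module. The names ContextParser, figure_contexts and the window parameter are hypothetical and introduced here for illustration only.

```python
import re
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# Assumption: a token is either a number (with optional , or . groups) or a word.
TOKEN_RE = re.compile(r"\d+(?:[.,]\d+)*|[A-Za-z]+")


class ContextParser(HTMLParser):
    """Flattens a page into tokens, each tagged with its enclosing class names."""

    def __init__(self):
        super().__init__()
        self._open = []    # stack of (tag, class attribute) for open elements
        self.tokens = []   # list of (token, tuple of enclosing class names)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._open.append((tag, dict(attrs).get("class") or ""))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates unclosed inner tags.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        classes = tuple(c for _, cls in self._open for c in cls.split())
        for tok in TOKEN_RE.findall(data):
            self.tokens.append((tok, classes))


def figure_contexts(html, window=2):
    """Returns one dict per numeric token: its value, classes and nearby words."""
    parser = ContextParser()
    parser.feed(html)
    parser.close()
    toks = parser.tokens
    found = []
    for i, (tok, classes) in enumerate(toks):
        if tok[0].isdigit():
            before = [t.lower() for t, _ in toks[max(0, i - window):i]]
            after = [t.lower() for t, _ in toks[i + 1:i + 1 + window]]
            found.append({"value": tok,
                          "classes": list(classes),
                          "words": before + after})
    return found


if __name__ == "__main__":
    page = ('<div class="product"><h1>Acme Laptop</h1>'
            '<p>Price: <span class="price">1,299.00</span> USD</p></div>')
    for ctx in figure_contexts(page):
        print(ctx)
```

A classifier or a simple scoring rule could then decide, from the classes and words of each candidate, which figure is the sought one (for instance, the price). How such features are weighted is left open here.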
Web extraction is the task of turning unstructured HTML into structured data. Previous approaches re...
Abstract: Deep Web contents are accessed by queries submitted to Web databases and the returned data...
We propose a method for extracting attributes and their values from Web pages. Our method makes use...
Abstract. Extracting data from web pages using wrappers is a fundamental problem arising in a large ...
With the explosive growth of information sources available on the World Wide Web, it has become incr...
We study possibilities to automatically extract information from the Internet, by structuring and co...
Abstract: Problem statement: Nowadays, many users use web search engines to find and gather informa...
http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/ This article presents a system to extr...
An important aspect of research for Web information extraction relates to the inference of complex r...
The Internet could be considered to be a reservoir of useful information in textual form — product c...