The goal of this work is the design and implementation of an application that allows effective data extraction from HTML pages. Emphasis is placed on making maximal use of existing XML technologies. The resulting application is based on the XQuery language, extended with options for working with web pages, and combines it with other technologies for finding relevant parts of free text. It also allows the XSLT language to be used to transform data into the required form. The application provides command-line, graphical, and server interfaces, accompanied by a user extension for the Mozilla Firefox 3 web browser. The command-line interface allows batch processing of queries, whereas the graphical interface offers user frie...
This master's thesis focuses on current technologies that are used for downloading web pages and ex...
This thesis evaluates XQuery as a complete solution for data storage and processing, application log...
The ability of the end user to work with a large amount of data from a large number of heterogeneous...
The goal of this work is the design and implementation of an application that allows effective data ex...
This thesis presents a mechanism based on eXtensible Markup Language (XML) to extract data from HTML...
This work describes the scope of creating an application for extracting and tracking data from HTML sites....
This work contains a brief overview of technologies for representing and obtaining data on the WWW and...
This thesis deals with data extraction from web pages written in HTML. It describes methods...
With the development of the Internet, the World Wide Web has become an invaluable information source...
In this thesis I focus on the increasing importance of automatic web page processing. Th...
The process of data extraction from internet sources has been originating the ...
The World Wide Web has become one of the most important information repositories. However...
This work focuses on data mining, and especially text mining, from Web pages, an overview of programs for downl...
The Web environment has developed into the largest source of electronic documents, so it would be very u...
We present new techniques for supervised wrapper generation and automated web information extraction...