Abstract: Dynamic HTML documents on the Internet contain useful information that can be reused by other applications. Unlike XML documents, however, HTML documents do not carry any semantics for the data in the page. Although a programmer can write a program that retrieves a piece of information from a specific HTML document available on the Internet, it is very difficult to write several different programs to retrieve information from different dynamic HTML pages with varying formats. This paper develops a simple and generic approach to retrieving dynamic HTML Internet-based information. As part of this approach, several techniques for retrieving data from dynamic HTML documents are developed. These techniques ...