Wrapping Data into XML

  • Wei Han
  • David Buttler
  • Calton Pu
Publication date
September 2015

Abstract

The vast majority of information that is available online, or coming online in the near future, is available only in HTML. To use this information for more than human browsing, it must be converted into a machine-readable format. Wrappers have been the key tool for converting HTML into semantically meaningful and well-structured XML data. However, developing wrappers is slow and tedious work, and the results are typically brittle. This paper describes XWRAP Elite, a tool to automatically generate robust wrappers, which breaks down the conversion process into three procedures: discovering where the data is located in an HTML page and separating the data into individual objects; decomposing objects into data elements; mar...
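
As a rough illustration of the first two procedures named in the abstract (locating the data and splitting it into objects, then decomposing each object into data elements), the following is a minimal hand-written sketch using only the Python standard library. It is not XWRAP Elite's method: the tool is described as generating wrappers automatically, whereas this sketch assumes the data sits in `<tr>`/`<td>` table rows and that the caller supplies the field names. The names `RowCollector`, `wrap`, `items`, and `item` are hypothetical. The third procedure is cut off in the abstract and is not modeled here.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class RowCollector(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row.

    Assumption for this sketch: every data object is one table row.
    """

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])            # a new object begins
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")        # a new data element begins

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def wrap(html, field_names, root_tag="items", object_tag="item"):
    """Convert table rows in `html` into an XML string.

    Rows whose cell count does not match `field_names` (for example
    header rows made of <th> cells) are skipped.
    """
    collector = RowCollector()
    collector.feed(html)
    collector.close()

    root = ET.Element(root_tag)
    for cells in collector.rows:
        if len(cells) != len(field_names):
            continue
        obj = ET.SubElement(root, object_tag)
        for name, value in zip(field_names, cells):
            ET.SubElement(obj, name).text = value.strip()
    return ET.tostring(root, encoding="unicode")
```

Calling `wrap(page, ["title", "price"])` on a page containing a two-column table would return one `<item>` element per data row. A hand-coded wrapper like this breaks as soon as the page layout changes, which is the brittleness the abstract refers to.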

Topics

  • XML (Language)
  • HTML (Language)