Documents are often marked up in XML-based tagsets to delineate major structural components such as headings, paragraphs, figure captions and so on, without much regard to their eventual displayed appearance. And yet these same abstract documents, after many transformations and 'typesetting' processes, often emerge in the popular format of Adobe PDF, either for dissemination or archiving. Until recently, PDF has been a totally display-based document representation, relying on the underlying PostScript semantics of PDF. Early versions of PDF had no mechanism for retaining any form of abstract document structure, but recent releases have introduced an internal structure tree to create the so-called 'Tagged PDF'. This paper describes the d...
As a result of an extensive investigation into the existing solutions to this problem, it has been d...
Information can include text, pictures and signatures that can be scanned into a document format, su...
As collections of archived digital documents continue to grow, the maintenance of an archive, and the...
This paper describes a tool for recombining the logical structure from an XML document with the type...
Document representations can rapidly become unwieldy if they try to encapsulate all possible documen...
It is just over 20 years since Adobe's PostScript opened a new era in digital documents. PostScript ...
The Portable Document Format (PDF), defined by Adobe Systems Inc. as the basis of its Acrobat produc...
Portable Document Format (PDF) is a page-oriented, graphically rich format based on PostScript seman...
A strategy for document analysis is presented which uses Portable Document Format (PDF, the underlyin...
The transformation of scanned paper documents to a form suitable for an Internet browser is a comple...
The paper "PDF Document Format Features for Document Management and Distribution" describes the core o...
This article presents Xed, a reverse engineering tool for PDF documents, which extracts the original...
document image analysis system that can transform paper documents into XML format [1]. An effective ...
The Portable Document Format (PDF) is a page-oriented, graphically rich document format based on Pos...
Shikano xml2tex is a framework to give XML a presentation layer using LaTeX. In other words, xml2tex...