This article presents Xed, a reverse engineering tool for PDF documents, which extracts the original document layout structure. Xed mixes electronic extraction methods with state-...
This paper is concerned with the integration of recursive data structures with Presburger arithmetic. The integrated theory includes a length function on data structures, thus prov...
Structured documents, especially the XML documents, are made up of a few logical components, such as title, sections, subsections and paragraphs. The components in each structured...
Successful applications of digital libraries require structured access to sources of information. This paper presents an approach to extract the logical structure of text document...
Substantial efforts have been made in order to cope with disjunctions in constraint based grammar formalisms (e.g. [Kasper, 1987; Maxwell and Kaplan, 1991; DSrre and Eisele, 1990]...