For historical documents, available transcriptions are typically inaccurate when compared with the scanned document images. Not only the position of the words and sentences are ...
This paper reports some experiments in using SVG (Scalable Vector Graphics), rather than the browser default of (X)HTML/CSS, as a potential Web-based rendering technology, in an a...
A novel strategy for the representation and manipulation of distributed documents, potentially complex and heterogeneous, is presented in this paper. The document under the propos...
Universal Business Language (UBL) is an OASIS initiative to develop common business document schemas to provide document interoperability in the eBusiness domain. Since t...
Logical entity recognition in heterogeneous collections of document page images remains a challenging problem, since the performance of traditional supervised methods degrades drama...