In this paper, we present a novel approach to search and retrieve from document image collections, without explicit recognition. Existing recognition-free approaches such as wor...
We propose a novel method to evaluate table segmentation results based on a table image ground truther. In the ground-truthing process, we first extract connected components fr...
More and more fonts have sprung up in recent years in the digital publishing industry and on reading devices. In this paper, we focus on methods of evaluating digital Chinese fonts and t...
The goal of this work is to add the capability to segment documents containing text, graphics, and pictures in the open source OCR engine OCRopus. To achieve this goal, OCRopus...
Amy Winder, Tim L. Andersen, Elisa H. Barney Smith
Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...