Abstract--We attempt to evaluate the efficacy of six unsupervised evaluation methods for tuning Sauvola's threshold in optical character recognition (OCR) applications. We propose...
The microscopic details of printing often go unnoticed by humans, but they can affect machine recognition of printed text. Models of the defects introduced into...
The Internet makes it possible to share information (e.g., text, images, audio, video, and other data formats) across the globe. In this paper we look at collaborative Internet en...
Extracting titles from a PDF's full text is an important task in information retrieval to identify PDFs. Existing approaches apply complicated and expensive (in terms of calculating...
When one scans a document page from a thick bound volume, the curvature of the page results in two kinds of distortion in the scanned document image: i) shade along...