Many documents are available to a computer only as images from paper. However, most natural language processing systems expect their input as character-coded text, which may be di...
When scanning documents with a large number of pages such as books, it is often feasible to provide a minimal number of training samples to personalize the system to compensate fo...
Abstract. The ability to learn from user interaction is an important asset for content-based image retrieval (CBIR) systems. Over short time scales, it enables the integration of ...
The proliferation of content-based image retrieval techniques has highlighted the need to understand the relationship between image clustering based on low-level image features and...
Abstract—This paper presents an adaptive algorithm for preprocessing document images prior to binarization in character recognition problems. Our method is similar in its approac...