This paper proposes a syntactic method for detection and correction of misrecognized mathematical formulae for a practical mathematical OCR system. Linear monadic context-free tre...
A significant amount of information is stored in computer systems today, but people are struggling to manage their documents such that the information is easily found. XML is a de...
This paper presents an iterative method for generative semantic clustering of related information elements in spatial hypertext documents. The goal is to automatically organize th...
Andruid Kerne, Eunyee Koh, Vikram Sundaram, J. Mic...
Many museum and library archives are digitizing their large collections of handwritten historical manuscripts to enable public access to them. These collections are only available...
Distributed heterogeneous search systems are an emerging phenomenon in Web search, in which independent topic-specific search engines provide search services, and metasearchers d...