The paper briefly presents a methodology for developing tools for knowledge-based search in repositories of digitized manuscripts. It is designed to assist the search activ...
Large-scale web and text retrieval systems deal with amounts of data that greatly exceed the capacity of any single machine. To handle the necessary data volumes and query through...
This paper presents an Enhanced Heuristic Segmenter (EHS) and an improved neural-based segmentation technique for segmenting cursive words and validating prospective segmentation ...
A trainable method for distinguishing between mathematical notation and natural language (here, English) in images of textlines, using only computational geometry methods, with no a...
The Dutch Tax and Customs Administration (DTCA) is one of many organizations that deal with a multitude of electronic legal data, from various sources and in different formats. In...
Radboud Winkels, Alexander Boer, Emile de Maat, To...