We present a series of experiments on automatically identifying the sense of implicit discourse relations, i.e. relations that are not marked with a discourse connective such as &...
The problem addressed in this paper is to segment a given multilingual document into segments for each language and then identify the language of each segment. The problem was mot...
This paper describes work on Named Entity Recognition (NER), in preparation for Relation Extraction (RE), on data from a historical archive organisation. As is often the case in t...
Cheap and versatile cameras make it possible to easily and quickly capture a wide variety of documents. However, low resolution cameras present a challenge to OCR because it is vi...
Charles E. Jacobs, Patrice Y. Simard, Paul A. Viol...
When recognizing multiple fonts, geometric features, such as the directional information of strokes, are generally robust against deformation but are weak against degradation. Thi...