Document image segmentation algorithms primarily aim to separate text and graphics in the presence of complex layouts. However, for many non-Latin scripts, segmentation becomes a ch...
In our research, we consider that access to semi-structured documents is carried out through a data-oriented query. With different users and the same query, the returned results are ...
Keyword search is an effective approach for most users to search for information because they do not need to learn complex query languages or the underlying structures of the data....
This paper proposes a novel approach to measuring XML document similarity by taking into account the semantics between XML elements. The motivation of the proposed approach is to ...
Cheap and versatile cameras make it possible to capture a wide variety of documents easily and quickly. However, low-resolution cameras present a challenge to OCR because it is vi...
Charles E. Jacobs, Patrice Y. Simard, Paul A. Viol...