Multi-document discourse analysis has emerged with the potential to improve various NLP applications. Based on the newly proposed Cross-document Structure Theory (CST), this pap...
In this paper, we present a framework for coupling an existing formatting system such as SMIL [7] or Madeus [13] with the formatting control system XEF [10]. This framework allows ...
Generation of ground-truths is of great importance for unbiased performance evaluation of document layout analysis methods. This is especially necessary because many methods are c...
The need for incremental constraint maintenance within collections of semi-structured documents has grown steadily in recent years owing to the widespread adoption of XML. T...
A new approach for constructing pseudo-keywords, referred to as Sense Units, is proposed. Sense Units are obtained by a word clustering process, where the underlying similarity re...