Business interoperation is especially important in electronic business. It requires the integration of business information, business documents and business processes. Nevertheles...
We investigate four hierarchical clustering methods (single-link, complete-link, groupwise-average, and single-pass) and two linguistically motivated text features (noun phrase he...
Vasileios Hatzivassiloglou, Luis Gravano, Ankineed...
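As a rough illustration of two of the clustering methods named in the abstract above, the sketch below contrasts single-link and complete-link merging. It is not the authors' implementation: the function names, the one-dimensional sample points, and the absolute-difference distance are all assumptions chosen to keep the example small.

```python
from itertools import combinations


def cluster_distance(a, b, dist, linkage):
    """Distance between two clusters under the chosen linkage rule."""
    pair_dists = [dist(x, y) for x in a for y in b]
    if linkage == "single":      # closest pair of members
        return min(pair_dists)
    if linkage == "complete":    # farthest pair of members
        return max(pair_dists)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(points, k, linkage, dist=lambda x, y: abs(x - y)):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # combinations() yields index pairs with i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(
                clusters[ij[0]], clusters[ij[1]], dist, linkage
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    sample = [0, 1, 2, 3, 10]  # hypothetical 1-D data
    for linkage in ("single", "complete"):
        print(linkage, agglomerate(sample, 2, linkage))
```

The only difference between the two methods is the `min` versus `max` over member pairs. For documents, the distance function would typically be replaced by something like one minus cosine similarity over term vectors.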
We reveal that the Okapi BM25 retrieval function tends to overly penalize very long documents. To address this problem, we present a simple yet effective extension of BM25, namel...
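The length penalty mentioned in the abstract above can be seen in the standard BM25 term-frequency component. The sketch below shows only that textbook component, not the extension the abstract goes on to propose. The function name, the default parameters k1 = 1.2 and b = 0.75, and the sample document lengths are assumptions for illustration.

```python
def bm25_tf_component(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Standard Okapi BM25 term-frequency component with length normalization."""
    # Pivoted length normalization: equals 1.0 for a document of average length.
    norm = 1.0 - b + b * doc_len / avg_doc_len
    return tf * (k1 + 1.0) / (tf + k1 * norm)


if __name__ == "__main__":
    avg_len = 500  # hypothetical average document length
    for doc_len in (500, 5_000, 50_000):
        score = bm25_tf_component(tf=1, doc_len=doc_len, avg_doc_len=avg_len)
        print(f"doc_len={doc_len:>6}  tf_component={score:.4f}")
```

For a fixed term frequency, the component shrinks toward zero as the document grows, so a very long document that does contain the query term can score almost the same as a document that does not contain it at all.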
We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle ...
The Mixed Raster Content (MRC) document compression standard (ITU-T T.44) specifies a multi-layer, multi-resolution representation of a compound document. The model is very efficie...