Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard le...
This paper describes an approach to digesting threads of archived discussion lists by clustering messages into approximate topical groups, and then extracting shorter overviews, a...
We propose a method of classifying XML documents and extracting XML schema from XML by inductive inference based on constraint logic programming. The goal of this work is to type ...
Pseudo-Relevance Feedback (PRF) assumes that the top-ranking n documents of the initial retrieval are relevant and extracts expansion terms from them. In this work, we int...
We consider the problem of modeling the content structure of texts within a specific domain, in terms of the topics the texts address and the order in which these topics appear. W...
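The baseline PRF assumption stated in the Pseudo-Relevance Feedback abstract above (treat the top-ranking n documents as relevant and extract expansion terms from them) can be illustrated with a minimal sketch. This is not the method that abstract goes on to introduce. The function name `prf_expand`, the whitespace tokenizer, the raw-frequency term scoring, and the parameter `k` are all assumptions made for illustration.

```python
from collections import Counter


def prf_expand(query, ranked_docs, n=2, k=2):
    """Return the query terms plus k expansion terms from the top-n documents.

    Hypothetical helper: ranked_docs is assumed to be a list of document
    strings already sorted by the initial retrieval score, best first.
    """
    query_terms = query.lower().split()
    counts = Counter()
    # PRF assumption: the top-n documents of the initial retrieval are relevant.
    for doc in ranked_docs[:n]:
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    # Rank candidates by descending frequency; break ties alphabetically
    # so the result is deterministic. Raw frequency is an assumed scoring rule.
    ranked_terms = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return query_terms + [term for term, _ in ranked_terms[:k]]


ranked = ["jaguar car engine", "jaguar car speed", "jaguar cat jungle"]
expanded = prf_expand("jaguar", ranked, n=2, k=1)
```

With `n=2`, only the first two documents contribute candidates, so the third document's terms are never considered. Practical systems typically replace raw frequency with a weighted score and filter stopwords.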