Generating query-biased summaries can take up a large part of the response time of interactive information retrieval (IIR) systems. This paper proposes to use document titles as a...
This paper investigates the role of ontologies as a central part of an architecture to repurpose existing material from the web. A prototype system called ArtEquAKT is presented, ...
Mark J. Weal, Harith Alani, Sanghee Kim, Paul H. L...
The traditional weighting schemes used in text categorization for the vector space model (VSM) cannot exploit information intrinsic to texts obtained through on-line handwriting r...
We introduce a generative probabilistic document model, based on latent Dirichlet allocation (LDA), that deals with textual errors in the document collection. Our model is inspired by...
The widespread use of XML creates a need to ensure the validity of XML data. Languages such as XML Schema simplify the verification of XML documen...