An empirical study has been conducted investigating the relationship between the performance of a generative language model in terms of perplexity and the corresponding informatio...
Leif Azzopardi, Mark Girolami, Keith van Rijsberge...
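For readers unfamiliar with the metric named in the abstract above, here is a minimal sketch of perplexity under its standard definition: the exponential of the average negative log-probability a model assigns to each token. The function name `perplexity` and its input format (a list of per-token probabilities) are illustrative assumptions, not taken from the paper.

```python
import math

def perplexity(token_probs):
    """Perplexity of a sequence, given the model's probability for each token.

    Hypothetical helper for illustration: token_probs is a list of
    probabilities in (0, 1], one per observed token.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token, then exponentiate.
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)
```

A model that assigns every token a probability of 0.25 is, on average, as uncertain as a uniform choice among four options, so its perplexity is about 4. Lower values mean the model predicts the text better.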
Stemming can improve retrieval accuracy, but stemmers are language-specific. Character n-gram tokenization achieves many of the benefits of stemming in a language-independent way,...
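The idea of character n-gram tokenization can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `char_ngrams`, the default n of 4, the lowercasing, the underscore padding at word boundaries, and the choice to keep n-grams within single words are all assumptions made for the example.

```python
def char_ngrams(text, n=4):
    """Split text into overlapping character n-grams, word by word.

    Assumptions for this sketch: words are whitespace-delimited, are
    lowercased, and are padded with '_' so that word-initial and
    word-final n-grams stay distinguishable.
    """
    grams = []
    for word in text.lower().split():
        padded = f"_{word}_"
        if len(padded) <= n:
            # Short words yield a single, shorter token.
            grams.append(padded)
        else:
            grams.extend(padded[i:i + n] for i in range(len(padded) - n + 1))
    return grams
```

Morphological variants such as "running" and "runs" share the n-gram `_run` under this scheme, which is how n-grams can stand in for a stemmer without any language-specific rules.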
In this paper, we introduce the fractal summarization model based on fractal theory. In fractal summarization, the important information is captured from the source text by ex...
We demonstrate a prototype distributed architecture for a digital library, using technology being developed under the MIX Project at the San Diego Supercomputer Center (SDSC) and ...
Chaitanya K. Baru, Vincent Chu, Amarnath Gupta, Be...
To evaluate the diversity of search results, test collections have been developed that identify multiple intents for each query. Intents are the different meanings or facets that...