Given a large-scale linked document collection, such as a collection of blog posts or a research literature archive, there are two fundamental problems that have generated a lot o...
We present an approach to improving the precision of an initial document ranking wherein we utilize cluster information within a graph-based framework. The main idea is to perform...
This paper presents a framework for user-oriented text mining. It is then illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...
Abstract— Replacement policies for general caching applications and Web caching in particular have been discussed extensively in the literature. Many ad-hoc policies have been pr...
We present a non-traditional retrieval problem we call subtopic retrieval. The subtopic retrieval problem is concerned with finding documents that cover many different subtopics ...
ChengXiang Zhai, William W. Cohen, John D. Laffert...