In this paper, we study the Web forum crawling problem, a fundamental step in many Web applications such as search engines and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
Determining the user intent of Web searches is a difficult problem due to the sparse data available concerning the searcher. In this paper, we examine a method to determine the us...
Bernard J. Jansen, Danielle L. Booth, Amanda Spink
Clio is an existing schema-mapping tool that provides user-friendly means to manage and facilitate the complex task of transformation and integration of heterogeneous data such as...
Haifeng Jiang, Howard Ho, Lucian Popa, Wook-Shin H...
Given a set of machines and a set of Web applications with dynamically changing demands, an online application placement controller decides how many instances to run for each appl...
Chunqiang Tang, Malgorzata Steinder, Mike Spreitze...
We revisit a problem introduced by Bharat and Broder almost a decade ago: how to sample random pages from the corpus of documents indexed by a search engine, using only the search...