Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
Relational clustering has attracted more and more attention due to its phenomenal impact in various important applications which involve multi-type interrelated data objects, such...
Abstract. In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...
The rich dependency structure found in the columns of real-world relational databases can be exploited to great advantage, but can also cause query optimizers--which usually assum...
Ihab F. Ilyas, Volker Markl, Peter J. Haas, Paul B...
The state of the art commercial query optimizers employ cost-based optimization and exploit dynamic programming (DP) to find the optimal query execution plan (QEP) without evalua...