Networking homes, offices, cars and hand-held computers is the current trend in distributed mobile computing. The ever-growing demand of the enterprise for integrating new technol...
Caching is a standard solution to the problem of insufficient bandwidth caused by the rapid increase of information circulation across the Internet. Cache consistency mechanisms are...
In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...
Numerous approaches, including textual, structural and featural, to detecting duplicate documents have been investigated. Considering document images are usually stored and transm...
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high-dimensional data, scalability, end...
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopul...