Abstract: We propose a new system that is able to handle the entire Personal Dataspace of a user. A Personal Dataspace includes all data pertaining to a user on all his disks and o...
Jens-Peter Dittrich, Lukas Blunschi, Markus Fä...
In this paper we study the Web forum crawling problem, a fundamental step in many Web applications such as search engines and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
The rapid growth of the web has been noted and tracked extensively. Recent studies have, however, documented the dual phenomenon: web pages have small half lives, and thus the web e...
Ziv Bar-Yossef, Andrei Z. Broder, Ravi Kumar, Andr...
The representation of information collections needs to be optimized for human cognition. While documents often include rich visual components, collections, including personal coll...
Open source intelligence analysts routinely use the web as a source of information related to their specific taskings. Effective information gathering on the web, despite the prog...