The transition of personal information management (PIM) tools off the desktop to the Web presents an opportunity to augment these tools with capabilities provided by the wealth o...
Max Van Kleek, Brennan Moore, David R. Karger, Pau...
Queries on major Web search engines produce complex result pages, primarily composed of two types of information: organic results, that is, short descriptions and links to relevan...
Cristian Danescu-Niculescu-Mizil, Andrei Z. Broder...
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
In modern Web applications, style formatting and layout calculation often account for a substantial amount of local Web page processing time. In this paper1 , we present two novel...
Kaimin Zhang, Lu Wang, Aimin Pan, Bin Benjamin Zhu