Duplicate URLs cause serious problems across the whole pipeline of a search engine, from crawling and indexing to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
Link analysis is a key technology in contemporary web search engines. Most previous work on link analysis used information from only one snapshot of the web graph. Since commer...
Lei Yang, Lei Qi, Yan-Ping Zhao, Bin Gao, Tie-Yan ...
We present four approaches to the Amharic-French bilingual track at CLEF 2005. All experiments use a dictionary-based approach to translate the Amharic queries into French Bags-...
This paper describes the participation of the Columbus Project of Microsoft Research Asia (MSRA) in GeoCLEF 2006 (a cross-language geographical retrieval track that is part of Cr...
Zhisheng Li, Chong Wang, Xing Xie, Xufa Wang,...
In recent years there has been considerable interest in analyzing random graph models for the Web. We consider two such models: the Random Surfer model, introduced by Blum et al....