In this paper, we report on a large-scale study of structural differences among national webs. The study is based on a web-scale crawl conducted in the summer of 2008. More specif...
Sukwon Chung, Dungjit Shiowattana, Pavel Dmitriev,...
In a typical content-based image retrieval (CBIR) system, query results are a set of images sorted by feature similarities with respect to the query. However, images with high fea...
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose three measures for quantifying the alienness (...
Most human activities occur around where the user is physically located. Knowing the geographical serving area of web resources, therefore, is very important for many web applicat...
Qi Zhang, Xing Xie, Lee Wang, Lihua Yue, Wei-Ying ...
We consider the problem of combining ranking results from various sources. In the context of the Web, the main applications include building meta-search engines, combining ranking...
Cynthia Dwork, Ravi Kumar, Moni Naor, D. Sivakumar