Ranking microblogs, such as tweets, as search results for a query is challenging, in part because of the sheer number of microblogs being generated in real time...
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
Distributed heterogeneous search systems are an emerging phenomenon in Web search, in which independent topic-specific search engines provide search services, and metasearchers d...
In recent years, search engines have been widely employed to find useful information on the Web, as they efficiently explore hyperlinks between web pages which ...
Zhenglu Yang, Lin Li, Botao Wang, Masaru Kitsurega...
In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...