The results of the Web query log analysis may be significantly shifted depending on the fraction of agents (non-human clients), which are not excluded from the log. To detect and ...
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Previous work on understanding user web search behavior has focused on how people search and what they are searching for, but not why they are searching. In this paper, we describ...
The top web search result is crucial for user satisfaction with the web search experience. We argue that the importance of the relevance at the top position necessitates special h...
Numerous widely publicized cases of theft and misuse of private information underscore the need for audit technology to identify the sources of unauthorized disclosure. We present...
Rakesh Agrawal, Alexandre V. Evfimievski, Jerry Ki...