PageRank is known to be an efficient metric for computing general document importance in the Web. While commonly used as a one-size-fits-all measure, the ability to produce topica...
We propose a model for user purchase behavior in online stores that provide recommendation services. We model the purchase probability given recommendations for each user based on...
Given a huge online social network, how do we retrieve information from it through crawling? Even better, how do we improve the crawling performance by using parallel crawlers tha...
Duen Horng Chau, Shashank Pandit, Samuel Wang, Chr...
In this paper we study the privacy preservation properties of a specific technique for query log anonymization: tokenbased hashing. In this approach, each query is tokenized, and ...
Ravi Kumar, Jasmine Novak, Bo Pang, Andrew Tomkins
Researchers of commercial search engines often collect data using the application programming interface (API) or by "scraping" results from the web user interface (WUI),...