Methods for ranking World Wide Web resources according to their position in the link structure of the Web are receiving considerable attention, because they provide the first e...
We consider the problem of sampling URLs uniformly at random from the Web. A tool for sampling URLs uniformly can be used to estimate various properties of Web pages, such as the ...
Monika Rauch Henzinger, Allan Heydon, Michael Mitz...
With over 800 million pages covering most areas of human endeavor, the World-wide Web is a fertile ground for data mining research to make a difference to the effectiveness of infor...
For languages with rich content over the web, business reviews are easily accessible via many known websites, e.g., Yelp.com. For languages with poor content over the web like Arab...
The emergence of general audience digital libraries (GADLs) defines a context that represents a hybrid of both "traditional" IR, using primarily bibliographic resource...