The information on the World Wide Web is growing without bound. Users may have very diversified preferences in the pages they target through a search engine. It is therefore a chal...
Qingzhao Tan, Xiaoyong Chai, Wilfred Ng, Dik Lun L...
We study efficient query processing in distributed web search engines with global index organization. The main performance bottleneck in this case is due to the large amount of i...
The poster describes a fast, simple, yet accurate method to associate large amounts of web resources stored in a search engine database with geographic locations. The method uses ...
The retrieval of similar documents from large scale datasets has been the one of the main concerns in knowledge management environments, such as plagiarism detection, news impact a...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
Researchers of commercial search engines often collect data using the application programming interface (API) or by "scraping" results from the web user interface (WUI),...