Sciweavers

» Web Engineering Revisited
SIGIR
2006
ACM
16 years 17 days ago
Finding near-duplicate web pages: a large-scale evaluation of algorithms
Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-the-art” algorithms for finding near-duplicate web pag...
Monika Rauch Henzinger
PVLDB
2008
15 years 6 months ago
Google's Deep Web crawl
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. Since it represents a large portion of the structu...
Jayant Madhavan, David Ko, Lucja Kot, Vignesh Gana...
ACL
2010
15 years 4 months ago
Speech-Driven Access to the Deep Web on Mobile Devices
The Deep Web is the collection of information repositories that are not indexed by search engines. These repositories are typically accessible through web forms and contain dynami...
Taniya Mishra, Srinivas Bangalore

Book
17 years 4 months ago
HTTP Programming Recipes for Java Bots
The book covers the following topics: examining the structure of HTTP requests, monitoring the packets being transferred between a web server and web browser, executing simple HTTP...
Jeff Heaton
WWW
2007
ACM
16 years 7 months ago
A cautious surfer for PageRank
This work proposes a novel cautious surfer to incorporate trust into the process of calculating authority for web pages. We evaluate a total of sixty queries over two large, real-...
Lan Nie, Baoning Wu, Brian D. Davison