We investigate the automatic generation of topic pages as an alternative to the current Web search paradigm. We describe a general framework, which combines query log analysis to ...
Users of hypertext systems like the World Wide Web (WWW) often find themselves following hypertext links deeper and deeper, only to find themselves “lost” and unable to fin...
R. Gandhi, Benjamin B. Bederson, G. Kumar, Ben Shn...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
Large, socially-driven Web 2.0 sites such as Facebook and Youtube have seen significant growth in popularity [5, 10]. However, strong demand also exists for socially-driven web s...
Frank Uyeda, Diwaker Gupta, Amin Vahdat, George Va...