Sciweavers

2423 search results - page 191 / 485
» Hypertext Information Retrieval for the Web
WWW
2007
ACM
16 years 7 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
JCDL
2009
ACM
16 years 1 month ago
A framework for describing web repositories
In prior work we have demonstrated that search engine caches and archiving projects like the Internet Archive’s Wayback Machine can be used to “lazily preserve” websites and...
Frank McCown, Michael L. Nelson
PKDD
2004
Springer
15 years 12 months ago
Summarization of Dynamic Content in Web Collections
This paper describes a new research proposal of multi-document summarization of dynamic content in web pages. Much information is lost in the Web due to the temporal character of w...
Adam Jatowt, Mitsuru Ishizuka
CIKM
2005
ACM
15 years 8 months ago
Fast on-line index construction by geometric partitioning
Inverted index structures are the mainstay of modern text retrieval systems. They can be constructed quickly using off-line merge-based methods, and provide efficient support for ...
Nicholas Lester, Alistair Moffat, Justin Zobel
HT
2011
ACM
14 years 10 months ago
Bridging link and query intent to enhance web search
Understanding query intent is essential to generating appropriate rankings for users. Existing methods have provided customized rankings to answer queries with different intent. W...
Na Dai, Xiaoguang Qi, Brian D. Davison