Sciweavers

» Challenges in Web Information Retrieval
WWW
2007
ACM
16 years 7 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
JCDL
2009
ACM
16 years 1 months ago
A framework for describing web repositories
In prior work we have demonstrated that search engine caches and archiving projects like the Internet Archive’s Wayback Machine can be used to “lazily preserve” websites and...
Frank McCown, Michael L. Nelson
PKDD
2004
Springer
15 years 12 months ago
Summarization of Dynamic Content in Web Collections
This paper describes a new research proposal of multi-document summarization of dynamic content in web pages. Much information is lost in the Web due to the temporal character of w...
Adam Jatowt, Mitsuru Ishizuka
ICWE
2009
Springer
15 years 10 months ago
Semantic web access prediction using WordNet
The user-observed latency of retrieving Web documents is one of the limiting factors when using the Internet as an information data source. Prefetching became an important technique ...
Lenka Hapalova
ACMDIS
2000
ACM
15 years 11 months ago
Browsers with Changing Parts: A Catalog Explorer for Philip Glass' Website
The development of navigational tools for a web site devoted to a catalog of musical compositions offers a variety of design challenges. A combination of techniques developed from...
Harry Hochheiser