Sciweavers

WWW
2006
ACM
16 years 18 days ago
Do not crawl in the DUST: different URLs with similar text
We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
APWEB
2006
Springer
15 years 10 months ago
Automatically Constructing Descriptive Site Maps
Rapid increase in the number of pages on web sites, and widespread use of search engine optimization techniques, lead to web sites becoming difficult to navigate. Traditional site ...
Pavel Dmitriev, Carl Lagoze
CIKM
2008
ACM
15 years 8 months ago
Indexing and retrieval of a Greek corpus
Greek is one of the most difficult languages to handle in Web Information Retrieval (IR) related tasks. Its difficulty stems from the fact that it is grammatically, morphologicall...
Georgios Paltoglou, Michail Salampasis, Fotis Laza...
WWW
2009
ACM
16 years 7 months ago
A densitometric analysis of web template content
What makes template content in the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
Christian Kohlschütter
WWW
2007
ACM
16 years 7 months ago
Image collector III: a web image-gathering system with bag-of-keypoints
We propose a new system to mine visual knowledge on the Web. There are huge image data as well as text data on the Web. However, mining image data from the Web is paid less attent...
Keiji Yanai