Sciweavers

1675 search results - page 196 / 335
» Surfing the web by site
WWW
2006
ACM
16 years 12 days ago
Do not crawl in the DUST: different URLs with similar text
We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
ADC
2005
Springer
16 years 10 hours ago
Discovering User Access Pattern Based on Probabilistic Latent Factor Model
There has been an increased demand for characterizing user access patterns using web mining techniques since the informative knowledge extracted from web server log files can not ...
Guandong Xu, Yanchun Zhang, Jiangang Ma, Xiaofang ...
IICS
2004
Springer
15 years 11 months ago
Towards Logical Hypertext Structure
Facing the retrieval problem according to the overwhelming set of documents online the adaptation of text categorization to web units has recently been pushed. The aim is to utiliz...
Alexander Mehler, Matthias Dehmer, Rüdiger Gl...
WIDM
2003
ACM
15 years 11 months ago
Datarover: a taxonomy based crawler for automated data extraction from data-intensive websites
The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomydire...
Hasan Davulcu, S. Koduri, Saravanakumar Nagarajan
JASIS
1998
15 years 6 months ago
Electronic News Delivery Project
An appreciation of the roles of genre and task is important in understanding how people browse the Web. Genre is characterized by content and form and is intimately linked to the ...
Carolyn R. Watters, Michael A. Shepherd, Forbes J....