Sciweavers

24444 search results - page 4461 / 4889
» A Data Model for Data Integration
WWW
2008
ACM
16 years 7 months ago
Recrawl scheduling based on information longevity
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...
Christopher Olston, Sandeep Pandey
WWW
2008
ACM
16 years 7 months ago
Mining for personal name aliases on the web
We propose a novel approach to find aliases of a given name from the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
Danushka Bollegala, Taiki Honma, Yutaka Matsuo, Mi...
WWW
2008
ACM
16 years 7 months ago
FolksoViz: a subsumption-based folksonomy visualization using Wikipedia texts
In this paper, targeting del.icio.us tag data, we propose a method, FolksoViz, for deriving subsumption relationships between tags by using Wikipedia texts, and visualizing a folk...
Kangpyo Lee, Hyunwoo Kim, Chungsu Jang, Hyoung-Joo...
WWW
2007
ACM
16 years 7 months ago
Towards domain-independent information extraction from web tables
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
Bernhard Krüpl, Bernhard Pollak, Marcus Herzo...
WWW
2006
ACM
16 years 7 months ago
Beyond PageRank: machine learning for static ranking
Since the publication of Brin and Page's paper on PageRank, many in the Web community have depended on PageRank for the static (query-independent) ordering of Web pages. We s...
Matthew Richardson, Amit Prakash, Eric Brill