Sciweavers

8301 search results - page 1419 / 1661
» Risk-Aware Information Retrieval
WWW
2006
ACM
16 years 7 months ago
Effective web-scale crawling through website analysis
The web crawler space is often delimited into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system which integrates features from thes...
Iván González, Adam Marcus 0002, Daniel N. M...
WWW
2006
ACM
16 years 7 months ago
Using graph matching techniques to wrap data from PDF documents
Wrapping is the process of navigating a data source, semiautomatically extracting data and transforming it into a form suitable for data processing applications. There are current...
Tamir Hassan, Robert Baumgartner
WWW
2006
ACM
16 years 7 months ago
A probabilistic approach to spatiotemporal theme pattern mining on weblogs
Mining subtopics from weblogs and analyzing their spatiotemporal patterns have applications in multiple domains. In this paper, we define the novel problem of mining spatiotempora...
Qiaozhu Mei, Chao Liu 0001, Hang Su, ChengXiang Zh...
WWW
2006
ACM
16 years 7 months ago
GoGetIt!: a tool for generating structure-driven web crawlers
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
WWW
2006
ACM
16 years 7 months ago
The impact of online music services on the demand for stars in the music industry
The music industry's business model is to produce stars. In order to do so, musicians producing music that fits into well defined clusters of factors explaining the demand of...
Ian Pascal Volz