Sciweavers

5107 search results - page 334 / 1022
» Data Mining and Information Retrieval
WWW
2008
ACM
16 years 7 months ago
Recrawl scheduling based on information longevity
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...
Christopher Olston, Sandeep Pandey
ICDM
2010
IEEE
15 years 4 months ago
Improved Consistent Sampling, Weighted Minhash and L1 Sketching
We propose a new Consistent Weighted Sampling method, where the probability of drawing identical samples for a pair of inputs is equal to their Jaccard similarity. Our me...
Sergey Ioffe
WWW
2011
ACM
15 years 1 month ago
Improving recommendation for long-tail queries via templates
The ability to aggregate huge volumes of queries over a large population of users allows search engines to build precise models for a variety of query-assistance features such as ...
Idan Szpektor, Aristides Gionis, Yoelle Maarek
CIKM
2005
Springer
16 years 8 days ago
On the estimation of frequent itemsets for data streams: theory and experiments
In this paper, we devise a method for the estimation of the true support of itemsets on data streams, with the objective to maximize one chosen criterion among {precision, recall}...
Pierre-Alain Laur, Richard Nock, Jean-Emile Sympho...
AICOM
2005
15 years 6 months ago
Integration of hospital data using agent technologies - A case study
Data retrieval and its integration is one of the major problems that face large and complex health organizations. This is especially relevant when patient information is produced i...
Ricardo João Cruz Correia, Pedro Manuel Vie...