Sciweavers

4971 search results
» On Scalable Information Retrieval Systems
CN
2006
15 years 6 months ago
A framework for mining evolving trends in Web data streams using dynamic learning and retrospective validation
The expanding and dynamic nature of the Web poses enormous challenges to most data mining techniques that try to extract patterns from Web data, such as Web usage and Web content....
Olfa Nasraoui, Carlos Rojas, Cesar Cardona
PVLDB
2008
15 years 6 months ago
Industry-scale duplicate detection
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
WWW
2005
ACM
16 years 7 months ago
Automatically learning document taxonomies for hierarchical classification
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
Kunal Punera, Suju Rajan, Joydeep Ghosh
ICSM
2009
IEEE
16 years 1 month ago
On the use of relevance feedback in IR-based concept location
Concept location is a critical activity during software evolution, as it produces the location where a change is to start in response to a modification request, such as a bug repo...
Gregory Gay, Sonia Haiduc, Andrian Marcus, Tim Men...
DOCENG
2009
ACM
16 years 1 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan