Sciweavers

» Trends in Storage Technologies
SIGIR
2004
ACM
16 years 4 days ago
Constructing a text corpus for inexact duplicate detection
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Jack G. Conrad, Cindy P. Schriber
SIGIR
2004
ACM
16 years 4 days ago
Parameterized generation of labeled datasets for text categorization based on a hierarchical directory
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
Dmitry Davidov, Evgeniy Gabrilovich, Shaul Markovi...
SIGIR
2004
ACM
16 years 4 days ago
Evaluating high accuracy retrieval techniques
Although information retrieval research has always been concerned with improving the effectiveness of search, in some applications, such as information analysis, a more specific ...
Chirag Shah, W. Bruce Croft
SIGIR
2003
ACM
15 years 12 months ago
Text categorization by boosting automatically extracted concepts
Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard le...
Lijuan Cai, Thomas Hofmann
SIGIR
2003
ACM
15 years 12 months ago
Automatic ranking of retrieval systems in imperfect environments
The empirical investigation of the effectiveness of information retrieval (IR) systems requires a test collection, a set of query topics, and a set of relevance judgments made by ...
Rabia Nuray, Fazli Can