Sciweavers

» Measuring Information Understanding in Large Document Collec...
SPIRE
2007
Springer
16 years 8 days ago
Extending Weighting Models with a Term Quality Measure
Abstract. Weighting models use lexical statistics, such as term frequencies, to derive term weights, which are used to estimate the relevance of a document to a query. Apart from t...
Christina Lioma, Iadh Ounis
EMNLP
2009
15 years 4 months ago
Large-Scale Verb Entailment Acquisition from the Web
Textual entailment recognition plays a fundamental role in tasks that require in-depth natural language understanding. In order to use entailment recognition technologies for real-...
Chikara Hashimoto, Kentaro Torisawa, Kow Kuroda, S...
INEX
2005
Springer
15 years 11 months ago
A Flexible Structured-Based Representation for XML Document Mining
This paper reports on the INRIA group’s approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allo...
Anne-Marie Vercoustre, Mounir Fegas, Saba Gul, Yve...
SIGIR
2009
ACM
16 years 21 days ago
Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce
This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of “more like this” queries in the life sciences domain. ...
Jimmy J. Lin
ICADL
2007
Springer
16 years 10 days ago
Automated Template-Based Metadata Extraction Architecture
This paper describes our efforts to develop a toolset and process for automated metadata extraction from large, diverse, and evolving document collections. A number of federal agen...
Paul Flynn, Li Zhou, Kurt Maly, Steven J. Zeil, Mo...