Sciweavers

CIKM
2011
Springer
14 years 6 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
CIKM
2011
Springer
14 years 6 months ago
Joint inference for cross-document information extraction
Previous information extraction (IE) systems are typically organized as a pipeline architecture of separated stages which make independent local decisions. When the data grows bey...
Qi Li, Sam Anzaroot, Wen-Pin Lin, Xiang Li, Heng J...
CIKM
2011
Springer
14 years 6 months ago
Classifying trending topics: a typology of conversation triggers on Twitter
Twitter summarizes the great deal of messages posted by users in the form of trending topics that reflect the top conversations being discussed at a given moment. These trending ...
Arkaitz Zubiaga, Damiano Spina, Víctor Fres...
CIKM
2011
Springer
14 years 6 months ago
Factorization-based lossless compression of inverted indices
Many large-scale Web applications that require ranked top-k retrieval are implemented using inverted indices. An inverted index represents a sparse term-document matrix, where non...
George Beskales, Marcus Fontoura, Maxim Gurevich, ...
CIKM
2011
Springer
14 years 6 months ago
Estimating selectivity for joined RDF triple patterns
A fundamental problem related to RDF query processing is selectivity estimation, which is crucial to query optimization for determining a join order of RDF triple patterns. In thi...
Hai Huang 0003, Chengfei Liu