Sciweavers

» Representing clusters for retrieval
CIKM
2011
Springer
14 years 6 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
SIGIR
2012
ACM
13 years 9 months ago
Parallelizing ListNet training using spark
As ever-larger training sets for learning to rank are created, scalability of learning has become increasingly important to achieving continuing improvements in ranking accuracy [...
Shilpa Shukla, Matthew Lease, Ambuj Tewari
SIGMOD
2007
ACM
16 years 6 months ago
Addressing diverse user preferences in SQL-query-result navigation
Database queries are often exploratory and users often find their queries return too many answers, many of them irrelevant. Existing work either categorizes or ranks the results t...
Zhiyuan Chen, Tao Li
ICDE
2006
IEEE
16 years 24 days ago
Evaluation of Placement and Access Assignment for Replicated Object Striping
The number of stored objects that should be targets of high-throughput retrieval, such as multimedia stream objects, has been increasing recently. To implement a high throughput storage...
Makoto Kataigi, Dai Kobayashi, Tomohiro Yoshihara,...
CIKM
2004
Springer
16 years 4 days ago
A practical web-based approach to generating topic hierarchy for text segments
It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper,...
Shui-Lung Chuang, Lee-Feng Chien