Sciweavers

1013 search results - page 114 / 203
CIKM
2004
Springer
15 years 11 months ago
Document clustering based on cluster validation
This paper presents a cluster validation based document clustering algorithm, which is capable of identifying both important feature words and true model order (cluster number). I...
Zheng-Yu Niu, Dong-Hong Ji, Chew Lim Tan
CIKM
2009
Springer
16 years 27 days ago
Improving web page classification by label-propagation over click graphs
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
KDD
2009
ACM
16 years 1 months ago
On burstiness-aware search for document sequences
As the number and size of large timestamped collections (e.g. sequences of digitized newspapers, periodicals, blogs) increase, the problem of efficiently indexing and searching su...
Theodoros Lappas, Benjamin Arai, Manolis Platakis,...
CIKM
2007
Springer
16 years 15 days ago
Recognition and classification of noun phrases in queries for effective retrieval
It has been shown that using phrases properly in document retrieval leads to higher retrieval effectiveness. In this paper, we define four types of noun phrases and present an...
Wei Zhang, Shuang Liu, Clement T. Yu, Chaojing Sun...
DGO
2006
15 years 7 months ago
Next steps in near-duplicate detection for eRulemaking
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Hui Yang, Jamie Callan, Stuart W. Shulman