Sciweavers

KDD
2002
ACM
Combining clustering and co-training to enhance text classification using unlabelled data
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
Bhavani Raskutti, Herman L. Ferrá, Adam Kow...
Interactive deduplication using active learning
Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...
Sunita Sarawagi, Anuradha Bhamidipaty
Customer lifetime value modeling and its use for customer retention planning
We present and discuss the important business problem of estimating the effect of retention efforts on the Lifetime Value of a customer in the Telecommunications industry. We disc...
Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vat...
Sequential cost-sensitive decision making with reinforcement learning
Recently, there has been increasing interest in the issues of cost-sensitive learning and decision making in a variety of applications of data mining. A number of approaches have ...
Edwin P. D. Pednault, Naoki Abe, Bianca Zadrozny
Discovering word senses from text
Categories and Subject Descriptors: Information Storage and Retrieval (Clustering)
Patrick Pantel, Dekang Lin
ANF: a fast and scalable tool for data mining in massive graphs
Graphs are an increasingly important data source, with such important graphs as the Internet and the Web. Other familiar graphs include CAD circuits, phone records, gene sequences...
Christopher R. Palmer, Phillip B. Gibbons, Christo...
Evaluating classifiers' performance in a constrained environment
In this paper, we focus on the methodology of finding a classifier with minimal cost in the presence of additional performance constraints. ROCCH analysis, where accuracy and cost are i...
Anna Olecka
Mining product reputations on the Web
Knowing the reputations of your own and/or competitors' products is important for marketing and customer relationship management. It is, however, very costly to collect and a...
Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi,...