Sciweavers

Search results for: Self-Organizing Data Mining
DEXAW
2009
IEEE
Clustering of Short Strings in Large Databases
A novel method, CLOSS, intended for textual databases is proposed. It successfully identifies misspelled string clusters, even if the cluster border is not prominent. The method ...
Michail Kazimianec, Arturas Mazeika
ICDM
2009
IEEE
Scalable Algorithms for Distribution Search
Distribution data naturally arise in countless domains, such as meteorology, biology, geology, industry and economics. However, relatively little attention has been paid to data m...
Yasuko Matsubara, Yasushi Sakurai, Masatoshi Yoshi...
ISMVL
2007
IEEE
On the Axiomatization of Generalized Entropic Metrics
Starting from an axiomatization of a generalization of Shannon entropy we introduce a set of axioms for a parametric family of distances over sets of partitions of finite sets. T...
Dan A. Simovici
ADMA
2005
Springer
One Dependence Augmented Naive Bayes
In real-world data mining applications, an accurate ranking is as important as an accurate classification. Naive Bayes (simply NB) has been widely used in data mining as a simple...
Liangxiao Jiang, Harry Zhang, Zhihua Cai, Jiang Su
KDD
2005
ACM
Enhancing the lift under budget constraints: an application in the mutual fund industry
A lift curve, with the true positive rate on the y-axis and the customer pull (or contact) rate on the x-axis, is often used to depict the model performance in many data mining ap...
Lian Yan, Michael Fassino, Patrick Baldasare