Sciweavers

» Data Set Balancing
DILS
2004
Springer
16 years 3 days ago
Heterogeneous Data Integration with the Consensus Clustering Formalism
Meaningfully integrating massive multi-experimental genomic data sets is becoming critical for the understanding of gene function. We have recently proposed methodologies for integ...
Vladimir Filkov, Steven Skiena
ACSC
2005
IEEE
16 years 10 days ago
The Electronic Primaries: Predicting the U.S. Presidency Using Feature Selection with Safe Data Reduction
The data mining inspired problem of finding the critical, and most useful features to be used to classify a data set, and construct rules to predict the class of future examples ...
Pablo Moscato, Luke Mathieson, Alexandre Mendes, R...
SIGMOD
2010
ACM
15 years 11 months ago
Sampling dirty data for matching attributes
We investigate the problem of creating and analyzing samples of relational databases to find relationships between string-valued attributes. Our focus is on identifying attribute...
Henning Köhler, Xiaofang Zhou, Shazia Wasim S...
ISMB
1993
15 years 8 months ago
Knowledge-Based Generation of Machine-Learning Experiments: Learning with DNA Crystallography Data
Though it has been possible in the past to learn to predict DNA hydration patterns from crystallographic data, there is ambiguity in the choice of training data (both in terms of th...
Dawn M. Cohen, Casimir A. Kulikowski, Helen Berman
SDM
2007
SIAM
15 years 8 months ago
On Privacy-Preservation of Text and Sparse Binary Data with Sketches
In recent years, privacy preserving data mining has become very important because of the proliferation of large amounts of data on the internet. Many data sets are inherently high...
Charu C. Aggarwal, Philip S. Yu