Sciweavers

1083 search results - page 123 / 217
» Efficient Discovery of Confounders in Large Data Sets
ACIIDS
2010
IEEE
170 views, Database
15 years 4 months ago
On the Effectiveness of Gene Selection for Microarray Classification Methods
Microarray data usually contains a high level of noisy gene data, which includes incorrect, noisy, and irrelevant genes. Before Microarray data classification takes pla...
Zhongwei Zhang, Jiuyong Li, Hong Hu, Hong Zhou
IMCSIT
2010
15 years 4 months ago
Finding Patterns in Strings using Suffixarrays
Finding regularities in large data sets requires implementations of systems that are efficient in both time and space requirements. Here, we describe a newly developed sy...
Herman Stehouwer, Menno van Zaanen
ICDE
2007
IEEE
162 views, Database
16 years 7 months ago
On Density Based Transforms for Uncertain Data Mining
In spite of the great progress in the data mining field in recent years, the problem of missing and uncertain data has remained a great challenge for data mining algorithms. Many ...
Charu C. Aggarwal
SIGMOD
2001
ACM
200 views, Database
16 years 6 months ago
Data Bubbles: Quality Preserving Performance Boosting for Hierarchical Clustering
In this paper, we investigate how to scale hierarchical clustering methods (such as OPTICS) to extremely large databases by utilizing data compression methods (such as BIRCH or ra...
Markus M. Breunig, Hans-Peter Kriegel, Peer Krö...
SDM
2008
SIAM
135 views, Data Mining
15 years 7 months ago
A Spamicity Approach to Web Spam Detection
Web spam, which refers to any deliberate actions bringing to selected web pages an unjustifiable favorable relevance or importance, is one of the major obstacles for high quality ...
Bin Zhou 0002, Jian Pei, ZhaoHui Tang