Sciweavers

1083 search results - page 108 / 217
» Efficient Discovery of Confounders in Large Data Sets
BMCBI
2008
122 views
15 years 6 months ago
Effects of dependence in high-dimensional multiple testing problems
Background: We consider effects of dependence among variables of high-dimensional data in multiple hypothesis testing problems, in particular the False Discovery Rate (FDR) contro...
Kyung In Kim, Mark A. van de Wiel
ICDM
2003
IEEE
104 views, Data Mining
15 years 11 months ago
Structure Search and Stability Enhancement of Bayesian Networks
Learning Bayesian network structure from large-scale data sets, without any expert-specified ordering of variables, remains a difficult problem. We propose systematic improvements ...
Hanchuan Peng, Chris H. Q. Ding
KDD
2007
ACM
165 views, Data Mining
16 years 6 months ago
Efficient and effective explanation of change in hierarchical summaries
Dimension attributes in data warehouses are typically hierarchical (e.g., geographic locations in sales data, URLs in Web traffic logs). OLAP tools are used to summarize the measu...
Deepak Agarwal, Dhiman Barman, Dimitrios Gunopulos...
ICDE
2009
IEEE
135 views, Database
16 years 8 months ago
Space-Constrained Gram-Based Indexing for Efficient Approximate String Search
Answering approximate queries on string collections is important in applications such as data cleaning, query relaxation, and spell checking, where inconsistencies and e...
Alexander Behm, Shengyue Ji, Chen Li, Jiaheng Lu
ICDM
2009
IEEE
147 views, Data Mining
15 years 4 months ago
Greedy Optimization for Contiguity-Constrained Hierarchical Clustering
The discovery and construction of inherent regions in large spatial datasets is an important task for many research domains such as climate zoning, eco-region analysis, public heal...
Diansheng Guo