Sciweavers

» Distributed Data Mining Models as Services on the Grid
ICDM
2010
IEEE
15 years 4 months ago
Block-GP: Scalable Gaussian Process Regression for Multimodal Data
Regression problems on massive data sets are ubiquitous in many application domains including the Internet, earth and space sciences, and finances. In many cases, regression algori...
Kamalika Das, Ashok N. Srivastava
NIPS
2007
15 years 7 months ago
Mining Internet-Scale Software Repositories
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
KDD
2007
ACM
16 years 6 months ago
Cleaning disguised missing data: a heuristic approach
In some applications such as filling in a customer information form on the web, some missing values may not be explicitly represented as such, but instead appear as potentially va...
Ming Hua, Jian Pei
GFKL
2007
Springer
16 years 18 days ago
The Noise Component in Model-based Cluster Analysis
The so-called noise-component has been introduced by Banfield and Raftery (1993) to improve the robustness of cluster analysis based on the normal mixture model. The idea is to ad...
Christian Hennig, Pietro Coretto
KDD
2008
ACM
16 years 6 months ago
Stream prediction using a generative model based on frequent episodes in event sequences
This paper presents a new algorithm for sequence prediction over long categorical event streams. The input to the algorithm is a set of target event types whose occurrences we wis...
Srivatsan Laxman, Vikram Tankasali, Ryen W. White