Sciweavers

14577 search results - page 2664 / 2916
» Statistical Language Modelling
KDD
2005
ACM
Data Mining
16 years 7 months ago
Email data cleaning
Addressed in this paper is the issue of 'email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
Jie Tang, Hang Li, Yunbo Cao, ZhaoHui Tang
KDD
2004
ACM
Data Mining
16 years 7 months ago
Exploiting a support-based upper bound of Pearson's correlation coefficient for efficiently identifying strongly correlated pairs
Given a user-specified minimum correlation threshold and a market basket database with N items and T transactions, an all-strong-pairs correlation query finds all item pairs with...
Hui Xiong, Shashi Shekhar, Pang-Ning Tan, Vipin Ku...
KDD
2001
ACM
Data Mining
16 years 7 months ago
The "DGX" distribution for mining massive, skewed data
Skewed distributions appear very often in practice. Unfortunately, the traditional Zipf distribution often fails to model them well. In this paper, we propose a new probability di...
Zhiqiang Bi, Christos Faloutsos, Flip Korn
RECOMB
2005
Springer
16 years 7 months ago
Learning Interpretable SVMs for Biological Sequence Classification
Background: Support Vector Machines (SVMs), using a variety of string kernels, have been successfully applied to biological sequence classification problems. While SVMs achieve ...
Christin Schäfer, Gunnar Rätsch, Sö...
RECOMB
2004
Springer
16 years 7 months ago
Comparing in situ mRNA expression patterns of Drosophila embryos
In situ staining of a target mRNA at several time points during the development of a D. melanogaster embryo gives one a detailed spatio-temporal view of the expression pattern of ...
Hanchuan Peng, Eugene W. Myers