Sciweavers

» Data Clustering: A Review
WWW
2008
ACM
16 years 7 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2007
ACM
16 years 7 months ago
Using d-gap patterns for index compression
Sequential patterns of d-gaps exist pervasively in inverted lists of Web document collection indices due to the cluster property. In this paper the information of d-gap sequential...
Jinlin Chen, Terry Cook
KDD
2008
ACM
16 years 7 months ago
SPIRAL: efficient and exact model identification for hidden Markov models
Hidden Markov models (HMMs) have received considerable attention in various communities (e.g., speech recognition, neurology and bioinformatics) since many applications that use HMM...
Yasuhiro Fujiwara, Yasushi Sakurai, Masashi Yamamu...
KDD
2008
ACM
16 years 7 months ago
Effective and efficient itemset pattern summarization: regression-based approaches
In this paper, we propose a set of novel regression-based approaches to effectively and efficiently summarize frequent itemset patterns. Specifically, we show that the problem of ...
Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan
KDD
2007
ACM
16 years 7 months ago
Fast direction-aware proximity for graph mining
In this paper we study asymmetric proximity measures on directed graphs, which quantify the relationships between two nodes or two groups of nodes. The measures are useful in seve...
Hanghang Tong, Christos Faloutsos, Yehuda Koren