Sciweavers

» Multiobjective Data Clustering
ICML
2007
IEEE
16 years 7 months ago
Learning distance function by coding similarity
We consider the problem of learning a similarity function from a set of positive equivalence constraints, i.e. 'similar' point pairs. We define the similarity in informa...
Aharon Bar-Hillel, Daphna Weinshall
WWW
2008
ACM
16 years 7 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2007
ACM
16 years 7 months ago
Using d-gap patterns for index compression
Sequential patterns of d-gaps exist pervasively in inverted lists of Web document collection indices due to the cluster property. In this paper the information of d-gap sequential...
Jinlin Chen, Terry Cook
KDD
2008
ACM
16 years 6 months ago
SPIRAL: efficient and exact model identification for hidden Markov models
Hidden Markov models (HMMs) have received considerable attention in various communities (e.g., speech recognition, neurology, and bioinformatics) since many applications that use HMM...
Yasuhiro Fujiwara, Yasushi Sakurai, Masashi Yamamu...
KDD
2008
ACM
16 years 6 months ago
Effective and efficient itemset pattern summarization: regression-based approaches
In this paper, we propose a set of novel regression-based approaches to effectively and efficiently summarize frequent itemset patterns. Specifically, we show that the problem of ...
Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan