The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper,...
- The activities and function of proteins can potentially be determined by protein sequence motifs. Therefore, obtaining the universally conserved and crossed protein family bounda...
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
In this article we present an incremental method for building a mixture model. Given the desired number of clusters K ≥ 2, we start with a two-component mixture and we optimize t...
We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of cla...