- The activities and function of proteins can potentially be determined by protein sequence motifs. Therefore, obtaining the universally conserved and crossed protein family bounda...
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
In this article we present an incremental method for building a mixture model. Given the desired number of clusters K ≥ 2, we start with a two-component mixture and we optimize t...
We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of cla...
Gaussian blurring mean-shift (GBMS) is a nonparametric clustering algorithm, having a single bandwidth parameter that controls the number of clusters. The algorithm iteratively sh...