Document clustering is useful in many information retrieval tasks: document browsing, organization and viewing of retrieval results, generation of Yahoo-like hierarchies of docume...
A variational approach is proposed for the unsupervised assessment of attribute variability of high-dimensional data given a differentiable similarity measure. The key q...
This paper presents the results of classifying Arabic text documents using the N-gram frequency statistics technique with a dissimilarity measure called the "Manhattan di...
Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to the same or different clu...
Protein sequence motif information is crucial to the analysis of biologically significant regions. The conserved regions have the potential to determine the role of the protei...