This paper considers the use of computational stylistics for performing authorship attribution of electronic messages, addressing categorization problems with as many as 20 differ...
Shlomo Argamon, Marin Saric, Sterling Stuart Stein
This paper introduces a new measure, robustness, for assessing the quality of machine-discovered knowledge from real-world databases that change over time. A piece of knowledge i...
In the past few years, nonlinear dimensionality reduction (NLDR), or nonlinear manifold learning, methods have attracted a great deal of interest in the machine learning communit...
Skewed distributions appear very often in practice. Unfortunately, the traditional Zipf distribution often fails to model them well. In this paper, we propose a new probability di...
Background: Cluster analysis is an integral part of high-dimensional data analysis. In the context of large-scale gene expression data, a filtered set of genes is grouped togethe...