Abstract. This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation b...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...
The co-association (CA) matrix was previously introduced to combine multiple partitions. In this paper, we analyze the CA matrix, and address its difference from the similarity ma...
Abstract. Bayesian spam filters, in general, compute probability estimations for tokens either without considering the email areas of occurrences except the body or treating the s...
This paper proposes a two-step graph partitioning method to discover constrained clusters with an objective function that follows the well-known minmax clustering principle. Compar...
In recent years, the technological advances in mapping genes have made it increasingly easy to store and use a wide variety of biological data. Such data are usually in the form o...