The problem of statistics and aggregate maintenance over data streams has gained popularity in recent years especially in telecommunications network monitoring, trend-related anal...
Complex biological data generated from various experiments are stored in diverse data types in multiple datasets. By appropriately representing each biological dataset as a kernel ...
Hiroaki Tanabe, Tu Bao Ho, Canh Hao Nguyen, Saori ...
Clustering methods for data-mining problems must be extremely scalable. In addition, several data mining applications demand that the clusters obtained be balanced, i.e., be of ap...
Spatial relationships are important issues for similarity-based retrieval in many image database applications. With the popularity of digital cameras and the related image process...
In this paper we study supervised and semi-supervised classification of e-mails. We consider two tasks: filing e-mails into folders and spam e-mail filtering. Firstly, in a sup...
Irena Koprinska, Josiah Poon, James Clark, Jason C...