Data clustering is an important task in many disciplines. A large number of studies have attempted to improve clustering by using the side information that is often encoded as pai...
To learn concepts over massive data streams, it is essential to design inference and learning methods that operate in real time with limited memory. Online learning methods such a...
In this paper, we propose a new way to automatically model and predict human behavior of receiving and disseminating information by analyzing the contact and content of personal c...
Xiaodan Song, Ching-Yung Lin, Belle L. Tseng, Ming...
Essentially all data mining algorithms assume that the datagenerating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
Many real life sequence databases, such as customer shopping sequences, medical treatment sequences, etc., grow incrementally. It is undesirable to mine sequential patterns from s...