Traditionally, research in identifying structured entities in documents has proceeded independently of document categorization research. In this paper, we observe that these two t...
We analyze a massive social network, gathered from the records of a large mobile phone operator, with more than a million users and tens of millions of calls. We examine the distr...
Although most time-series data mining research has concentrated on providing solutions for a single distance function, in this work we motivate the need for a single index structu...
An ensemble is a set of learned models that make decisions collectively. Although an ensemble is usually more accurate than a single learner, existing ensemble methods often tend ...
In this paper we present a comprehensive log compression (CLC) method that uses frequent patterns and their condensed representations to identify repetitive information from large ...