In the information age, the availability of data on consumer profiles has opened new possibilities for companies to increase their revenue via data mining techniques. One approach...
In this work, we introduce the new problem of finding time series discords. Time series discords are subsequences of a longer time series that are maximally different to all the r...
Recent work in deduplication has shown that collective deduplication of different attribute types can improve performance. But although these techniques cluster the attributes col...
We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an e cient algorithm that generates all s...
Spatio-temporal, geo-referenced datasets are growing rapidly, and will be more in the near future, due to both technological and social/commercial reasons. From the data mining vie...