We consider the problem of numerical stability and model density growth when training a sparse linear model from massive data. We focus on scalable algorithms that optimize certain...
Clustering methods usually require to know the best number of clusters, or another parameter, e.g. a threshold, which is not ever easy to provide. This paper proposes a new graph-b...
Semi-structured data such as XML and HTML is attracting considerable attention. It is important to develop various kinds of data mining techniques that can handle semistructured d...
Popularity of mobile devices is accompanied by widespread security problems, such as MAC address spoofing in wireless networks. We propose a probabilistic approach to temporal an...
Projection pursuit was originally introduced to identify structures in multivariate data clouds (Huber, 1985). The idea of projecting data to a lowdimensional subspace can also be ...
Peter Filzmoser, Sven Serneels, Christophe Croux, ...