Studies have shown that program comprehension takes up to 45% of software development costs. Such high costs are caused by the lack-of documented specification and further aggrava...
We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal...
How do we find a natural clustering of a real world point set, which contains an unknown number of clusters with different shapes, and which may be contaminated by noise? Most clu...
Correlation clustering aims at grouping the data set into correlation clusters such that the objects in the same cluster exhibit a certain density and are all associated to a comm...
The problem of assessing the significance of data mining results on high-dimensional 0?1 data sets has been studied extensively in the literature. For problems such as mining freq...
Aristides Gionis, Heikki Mannila, Panayiotis Tsapa...