In this paper we show that diagnostic classes in cancer gene expression data sets, which most often include thousands of features (genes), may be effectively separated with simple ...
Gregor Leban, Minca Mramor, Ivan Bratko, Blaz Zupa...
This paper introduces support envelopes--a new tool for analyzing association patterns--and illustrates some of their properties, applications, and possible extensions. Specifical...
The geometric median is a classic robust estimator of centrality for data in Euclidean spaces. In this paper we formulate the geometric median of data on a Riemannian manifold as ...
P. Thomas Fletcher, Suresh Venkatasubramanian, Sar...
Clustering, in data mining, is useful for discovering distribution patterns in the underlying data. Clustering algorithms usually employ a distance-metric-based (e.g., Euclidean) simi...
Clustering algorithms typically operate on a feature vector representation of the data and find clusters that are compact with respect to an assumed (dis)similarity measure betwee...