The goal of this paper is to improve the prediction performance of fault-prone module prediction models (fault-proneness models) by employing over- and under-sampling methods, which ar...
An important aspect of Semantic Web technologies is identity, that is, uniquely identifying resources, which is essential for integrating data across sources. Currently, th...
The discovery and construction of inherent regions in large spatial datasets is an important task for many research domains such as climate zoning, eco-region analysis, public heal...
We present a probabilistic model for robust principal component analysis (PCA) in which the observation noise is modelled by Student-t distributions that are independent ...
Background: Agglomerative hierarchical clustering (AHC) is a common unsupervised data analysis technique used in several biological applications. Standard AHC methods require that...