Research in the fields of software quality and maintainability requires the analysis of large quantities of data, which often originate from open-source software projects. Pre-pro...
Extract-Transform-Load (ETL) workflows are data-centric workflows responsible for transferring, cleaning, and loading data from their respective sources to the warehouse. Previous ...
A nonparametric Bayesian extension of Independent Components Analysis (ICA) is proposed where observed data Y is modelled as a linear superposition, G, of a potentially i...
In this paper, we explore robustness and domain adaptation issues for Word Sense Disambiguation (WSD) using Singular Value Decomposition (SVD) and unlabeled data. We focus on the s...
Discriminative learning methods are widely used in natural language processing. These methods work best when their training and test data are drawn from the same distribution. For...