Sampling methods are a direct approach to tackle the problem of class imbalance. These methods sample a data set in order to alter the class distributions. Usually these methods ar...
Ronaldo C. Prati, Gustavo E. A. P. A. Batista, Mar...
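As an illustration of the sampling idea described in the abstract above, the following is a minimal sketch of random over- and under-sampling in plain Python. It is not the authors' method: the function name `random_resample`, the `mode` parameter, and the choice to balance all classes to the largest (or smallest) class size are assumptions made for this example.

```python
import random
from collections import defaultdict


def random_resample(X, y, mode="over", seed=0):
    """Balance class counts by random duplication ("over") or random removal ("under").

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)

    # Group the examples by class label.
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)

    sizes = [len(rows) for rows in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) < target:
            # Oversampling: keep every example and add random duplicates.
            rows = rows + [rng.choice(rows) for _ in range(target - len(rows))]
        elif len(rows) > target:
            # Undersampling: keep a random subset without replacement.
            rows = rng.sample(rows, target)
        X_out.extend(rows)
        y_out.extend([label] * target)
    return X_out, y_out
```

Calling `random_resample(X, y, mode="over")` returns a data set in which every class has as many examples as the original majority class; `mode="under"` shrinks every class to the size of the original minority class.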
In this paper we address the issue of automatically assigning information status to discourse entities. Using an annotated corpus of conversational English and exploiting morpho-s...
Symbolic data analysis aims at generalizing some standard statistical data mining methods, such as those developed for classification tasks, to the case of symbolic objects (SOs). ...
For two-class classification, it is common to classify by setting a threshold on class probability estimates, where the threshold is determined by ROC curve analysis. An analog fo...
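The two-class procedure mentioned in the abstract above, choosing a threshold on class probability estimates from the ROC curve, can be sketched as follows. The abstract does not say which criterion selects the operating point; this example assumes Youden's J statistic (true positive rate minus false positive rate), and the function names `roc_points` and `best_threshold` are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, fpr, tpr) tuples, one per distinct score.

    An example is predicted positive when its score is >= threshold.
    Labels are assumed to be 1 (positive) or 0 (negative).
    """
    pos = sum(1 for label in labels if label == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")

    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, label in zip(scores, labels) if s >= t and label == 1)
        fp = sum(1 for s, label in zip(scores, labels) if s >= t and label == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def best_threshold(scores, labels):
    """Pick the threshold maximizing Youden's J = tpr - fpr (an assumed criterion)."""
    threshold, _, _ = max(roc_points(scores, labels), key=lambda p: p[2] - p[1])
    return threshold
```

Other selection rules, such as minimizing expected misclassification cost for given class priors, would slot into `best_threshold` by changing only the `key` function.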
In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automat...