The problem of automatically classifying the gender of a blog author has important applications in many commercial domains. Existing systems mainly use features such as words, wor...
This paper describes a novel method to create a quantitative model of an educational content domain of related practice item-types using learning curves. By using a pairwise test t...
Philip I. Pavlik Jr., Hao Cen, Kenneth R. Koedinge...
Data cleaning is the process of correcting anomalies in a data source that may, for instance, be due to typographical errors or duplicate representations of an entity. It is a cruc...
Recent years have seen the rapid spread of biometric technologies for automatic people recognition. However, security and privacy issues still represent the main obstacles for the ...
We consider the problem of Semi-supervised Learning (SSL) from general unlabeled data, which may contain irrelevant samples. Within the binary setting, our model manages to better...
Kaizhu Huang, Zenglin Xu, Irwin King, Michael R. L...