Finding latent patterns in high dimensional data is an important research problem with numerous applications. The most well known approaches for high dimensional data analysis are...
We introduce a multi-stage ensemble framework, ErrorDriven Generalist+Expert or Edge, for improved classification on large-scale text categorization problems. Edge first trains a ...
Mining different types of communities from web data have attracted a lot of research efforts in recent years. However, none of the existing community mining techniques has taken i...
Qiankun Zhao, Sourav S. Bhowmick, Xin Zheng, Kai Y...
In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
Topic representation mismatch is a key problem in topic-oriented summarization for the specified topic is usually too short to understand/interpret. This paper proposes a novel ad...