Classification is one of the basic tasks of data mining in modern database applications including molecular biology, astronomy, mechanical engineering, medical imaging or meteorolo...
Stemming algorithms find canonical forms for inflected words, e. g. for declined nouns or conjugated verbs. Since such a unification of words with respect to gender, number, time, ...
Topic models have been studied extensively in the context of monolingual corpora. Though there are some attempts to mine topical structure from cross-lingual corpora, they require ...
The discovery of characteristic rules is a well-known data mining task and has lead to several successful applications. However, because of the descriptive nature of characteristic...
Retweeting is an important action (behavior) on Twitter, indicating the behavior that users re-post microblogs of their friends. While much work has been conducted for mining text...
Zi Yang, Jingyi Guo, Keke Cai, Jie Tang, Juanzi Li...