Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge numbers of features. Most previous studies found that the major...
Automated detection of the first document reporting each new event in temporally-sequenced streams of documents is an open challenge. In this paper we propose a new approach which...
Yiming Yang, Jian Zhang, Jaime G. Carbonell, Chun ...
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper the distribution and types of new words are discussed emp...
Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domai...
With the exponential growth of Web contents, Recommender System has become indispensable for discovering new information that might interest Web users. Despite their success in th...