We present UWN, a large multilingual lexical knowledge base that describes the meanings and relationships of words in over 200 languages. This paper explains how link prediction, ...
Previous work using topic model for statistical machine translation (SMT) explore topic information at the word level. However, SMT has been advanced from word-based paradigm to p...
Xinyan Xiao, Deyi Xiong, Min Zhang, Qun Liu, Shoux...
The information used for the extraction of terms can be considered as rather 'internal', i.e. coming from the candidate string itself. This paper presents the incorporat...
Most topic models, such as latent Dirichlet allocation, rely on the bag-of-words assumption. However, word order and phrases are often critical to capturing the meaning of text in...
The detection of new information in a document stream is an important component of many potential applications. In this paper, a new novelty detection approach based on the identi...