We propose a method of synonymous paraphrasing of a text based on WordNet synonymy data and Internet statistics of stable word combinations (collocations). Given a text, we look fo...
Most new words, or neologisms, bubble beneath the surface of widespread usage for some time, perhaps even years, before gaining acceptance in conventional print dictionaries [1]. A...
In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike most western languages. Therefore Chinese word segmentation is considered an ...
It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese...
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick ...
—A common strategy to assign keywords to documents is to select the most appropriate words from the document text. One of the most important criteria for a word to be selected as...