In this paper, we extend a Chinese thesaurus with words used distinctively in various Chinese communities. The acquisition and classification of such region-specific lexic...
The classifier built from a data set with a highly skewed class distribution generally predicts the more frequently occurring classes much more often than the infrequently occurr...
Natural language processing technology has developed remarkably, but it is still difficult for computers to understand contextual meanings as humans do. The purpose of our work ha...
We describe results of a word sense annotation task using WordNet, involving half a dozen well-trained annotators on ten polysemous words for three parts of speech. One hundred se...
Rebecca J. Passonneau, Ansaf Salleb-Aouissi, Vikas...
This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual Spanish (NCCSp). The corpus contains around ...