In this paper we investigate ChineseEnglish name transliteration using comparable corpora, corpora where texts in the two languages deal in some of the same topics -- and therefor...
In this paper, a word alignment approach is presented which is based on a combination of clues. Word alignment clues indicate associations between words and phrases. They can be b...
The problem of sequence categorization is to generalize from a corpus of labeled sequences procedures for accurately labeling future unlabeled sequences. The choice of representat...
The paper describes an algorithm that employs English and French text taggers to associate noun phrases in an aligned bilingual corpus. The taggets provide part-of-speech categori...
The purpose of this paper is to characterize a constituent boundary parsing algorithm, using an information-theoretic measure called generalized mutual information, which serves a...