The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence o...
This paper studies two methods for training hierarchical MT rules independently of word alignments. Bilingual chart parsing and EM algorithm are used to train bitext correspondenc...
We develop a similarity measure to detect repeatedly occurring Out-of-Vocabulary words (OOV), since these carry important information. Sub-word sequences in the recognition output...
Mirko Hannemann, Stefan Kombrink, Martin Karafi&aa...
Mapping documents into an interlingual representation can help bridge the language barrier of a cross-lingual corpus. Previous approaches use aligned documents as training data to...
Technical support procedures are typically very complex. Users often have trouble following printed instructions describing how to perform these procedures, and these instructions...
Tessa A. Lau, Lawrence D. Bergman, Vittorio Castel...