In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike most western languages. Therefore Chinese word segmentation is considered an ...
In this paper, we present HeiNER, the multilingual Heidelberg Named Entity Resource. HeiNER contains 1,547,586 disambiguated English Named Entities together with translations and ...
Wolodja Wentland, Johannes Knopp, Carina Silberer,...
In current phrase-based Statistical Machine Translation systems, more training data is generally better than less. However, a larger data set eventually introduces a larger model ...
We present a method which, given a few words defining a concept in some language, retrieves, disambiguates and extends corresponding terms that define a similar concept in another...
In previous work, we reported dramatic improvements in automatic speech recognition (ASR) and spoken language translation (SLT) gained by applying information extracted from spoke...