Process technologies play an increasingly important role as the world is being digitalized in nearly every corner. The major obstacles to their massive deployment include reusabil...
Latent Dirichlet Allocation (LDA) is a fully generative approach to language modelling which overcomes the inconsistent generative semantics of Probabilistic Latent Semantic Index...
We describe a system for extracting mentions of terms such as company and product names, in a large and noisy corpus of documents, such as the World Wide Web. Since natural langua...
Einat Amitay, Rani Nelken, Wayne Niblack, Ron Siva...
Abstract. This paper describes a methodology for constructing aligned German-Chinese corpora from movie subtitles. The corpora will be used to train a special machine translation s...
Compile-time program optimizations are similar to poetry: more are written than are actually published in commercial compilers. Hard economic reality is that many interesting opti...