We show that Kolmogorov complexity and such its estimators as universal codes (or data compression methods) can be applied for hypothesis testing in a framework of classical mathe...
: The KVB Data Warehouse enables the Bavarian Association of Statutory Health Insurance (SHI) Physicians to track public health data in a centralised way. In a sophisticated proces...
Katharina Wirtz, Martin Tauscher, Maria Zwerenz, A...
Truecasing is the process of restoring case information to badly-cased or noncased text. This paper explores truecasing issues and proposes a statistical, language modeling based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
We discuss the annotation with part of speech and lemma of the Dutch PAROLE Internet Corpus. The PAROLE PoS tagger is a combination of statistical taggers. It includes the Markov ...
Collocational prepositional phrases like ten koste van (at the expense of), met het oog op (with an eye on), and onder het mom van (under the pretext of) are patterns of the form ...