Amharic is the official language of Ethiopia and uses Ethiopic script for writing. In this paper, we present writer-independent HMM-based Amharic word recognition for offline hand...
We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, wit...
Ralf Steinberger, Bruno Pouliquen, Anna Widiger, C...
It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper,...
Supervised text classification is the task of automatically assigning a category label to a previously unlabeled text document. We start with a collection of pre-labeled examples ...
Background: The development of text mining systems that annotate biological entities with their properties using scientific literature is an important recent research topic. These...