This paper proposes a chunking strategy to detect unknown words in Chinese word segmentation. First, a raw sentence is pre-segmented into a sequence of word atoms 1 using a maximum...
Identification of transliterations is aimed at enriching multilingual lexicons and improving performance in various Natural Language Processing (NLP) applications including Cross ...
Abstract. This paper presents a brand-new Slovak text-to-speech system. It was developed within the framework of ARTIC system (primarily designed to synthesize Czech speech) with r...
Activity in social media such as blogs, micro-blogs, social networks, etc is manifested via interaction that involves text, images, links and other information items. Naturally, s...
Background: Since Swanson proposed the Undiscovered Public Knowledge (UPK) model, there have been many approaches to uncover UPK by mining the biomedical literature. These earlier...