As tokenization is usually ambiguous for many natural languages such as Chinese and Korean, tokenization errors might potentially introduce translation mistakes for translation sy...
Xinyan Xiao, Yang Liu, Young-Sook Hwang, Qun Liu, ...
We present a novel Evaluation Metric for Morphological Analysis (EMMA) that is both linguistically appealing and empirically sound. EMMA uses a graphbased assignment algorithm, op...
Ambiguity of entity mentions and concept references is a challenge to mining text beyond surface-level keywords. We describe an effective method of disambiguating surface forms an...
Yiping Zhou, Lan Nie, Omid Rouhani-Kalleh, Flavian...
Dative variation is a widely observed syntactic phenomenon in world languages (e.g. I gave John a book and I gave a book to John). It has been shown that which surface form will b...
We present a web-based algorithm for the task of POS tagging of unknown words (words appearing only a small number of times in the training data of a supervised POS tagger). When ...
Shulamit Umansky-Pesin, Roi Reichart, Ari Rappopor...