This paper1 presents a rapid and robust parsing system currently used to learn from large bodies of unedited text. The system contains a multivalued part-of-speech disambiguator a...
We present an approach to multilingual grammar induction that exploits a phylogeny-structured model of parameter drift. Our method does not require any translated texts or token-l...
We propose a novel algorithm for sentiment summarization that takes account of informativeness and readability, simultaneously. Our algorithm generates a summary by selecting and ...
Abstract--We present a tool that facilitates the efficient extension of morphological lexica. The tool exploits information from a morphological lexicon, a morphological grammar an...
This paper presents a new approach for the binarization of seriously degraded manuscript. We introduce a new technique based on a Markov Random Field (MRF) model of the document. ...