This paper describes the architecture and implementation of a full-scale pronunciation lexicon for Turkish using finite-state technology. The system produces at its output a ...
Natural language understanding involves the simultaneous consideration of a large number of different sources of information. Traditional methods employed in language an...
Background: Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the aut...
Mark Stevenson, Yikun Guo, Robert J. Gaizauskas, D...
The ability to detect similarity in conjunct heads is potentially a useful tool in helping to disambiguate coordination structures - a difficult task for parsers. We propose a di...
This paper describes a text normalization system for deletion-based abbreviations in informal text. We propose using statistical classifiers to learn the probability of deleting ...