This paper deals with two linguistic phenomena which are usually considered cases of ill-formedness by the computational linguistics community: intersentential ellipsis and coordi...
We describe an approach to simultaneous tokenization and part-of-speech tagging that is based on separating closed- and open-class items and focusing on the likelihood of the ...