Accessing the structured content of PDF document is a difficult task, requiring pre-processing and reverse engineering techniques. In this paper, we first present different methods...
Statistical parsing of noun phrase (NP) structure has been hampered by a lack of goldstandard data. This is a significant problem for CCGbank, where binary branching NP derivation...
The Penn Treebank does not annotate within base noun phrases (NPs), committing only to flat structures that ignore the complexity of English NPs. This means that tools trained on...
We give a priority queue that guarantees the worstcase cost of Θ(1) per minimum finding, insertion, and decrease; and the worst-case cost of Θ(lg n) with at most lg n + O( √ ...
We propose a supervised word sense disambiguation (WSD) method using tree-structured conditional random fields (TCRFs). By applying TCRFs to a sentence described as a dependency t...