Many sequence labeling tasks in NLP require solving a cascade of segmentation and tagging subtasks, such as Chinese POS tagging, named entity recognition, and so on. Traditional p...
Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...
Daniel David Walker, William B. Lund, Eric K. Ring...
A number of algorithms and approaches have been proposed towards the problem of scanning and digitizing research papers. We can classify work done in the past into three major appr...
Deepank Gupta, Bob Morris, Terry Catapano, Guido S...
Conditional random fields (CRFs) have been quite successful in various machine learning tasks. However, as larger and larger data become acceptable for the current computational ma...
This paper presents a new method to automatically add n-grams containing out-of-vocabulary (OOV) words to a baseline language model (LM), where these n-grams are sought to be gram...