s; these annotations provide an abstract description of the effects of particular linguistic choices, allowing the planner to evaluate these choices without needing any linguistic k...
The Online Database of Interlinear Text (ODIN) is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. Although large...
In this paper, we describe an SVM classification framework for the session detection task on both Chinese and English query logs. With eight features on the aspects of temporal and cont...
This paper proposes a novel dewarping technique for document images of bound volumes. This technique is a kind of model fitting technique for estimating the warp of each text li...
Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely af...
Jason D. Rennie, Lawrence Shih, Jaime Teevan, Davi...