Multinomial distributions are often used to model text documents. However, they do not capture well the phenomenon that words in a document tend to appear in bursts: if a word app...
Rasmus Elsborg Madsen, David Kauchak, Charles Elka...
Document recognition involves many kinds of hypotheses: segmentation hypotheses, classification hypotheses, spatial relationship hypotheses, and so on. Many recognition strategie...
Richard Zanibbi, Dorothea Blostein, James R. Cordy
The analysis of handwritten documents from the viewpoint of determining their authorship has great bearing on the criminal justice system. In many cases, only a limited amount of ...
Sargur N. Srihari, Catalin I. Tomai, Bin Zhang, Sa...
The relevance of a web document could be measured not only by its text content, but also by some other factors such as the link connectivity, the usage pattern. In previous data f...
The problem of version detection is critical in many important application scenarios, including software clone identification, Web page ranking, plagiarism detection, and peer-to-...
Deise de Brum Saccol, Nina Edelweiss, Renata de Ma...