—We consider the problem of inferring and modeling topics in a sequence of documents with known publication dates. The documents at a given time are each characterized by a topic...
Iulian Pruteanu-Malinici, Lu Ren, John William Pai...
Abstract. We propose a semantic tagger that provides high level concept information for phrases in clinical documents. It delineates such information from the statements written by...
In this paper we propose a probabilistic model for online document clustering. We use non-parametric Dirichlet process prior to model the growing number of clusters, and use a pri...
Document images undergo various degradation processes. Numerous models of these degradation processes have been proposed in the literature. In this paper we propose a modelbased r...
Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...
Daniel David Walker, William B. Lund, Eric K. Ring...