Electronic publishing, and in particular Web-based publishing, has assumed an increasing importance in higher education. The possibility of delivering learning material to students...
Due to the lack of annotated data sets, there are few studies on machine-learning-based approaches to extracting named entities (NEs) from clinical text. The 2009 i2b2 NLP challenge is...
The infinite hidden Markov model is a nonparametric extension of the widely used hidden Markov model. Our paper introduces a new inference algorithm for the infinite hidden Markov...
Jurgen Van Gael, Yunus Saatci, Yee Whye Teh, Zoubi...
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Probabilistic modelling of text data in the bag-of-words representation has been dominated by directed graphical models such as pLSI, LDA, NMF, and discrete PCA. Recently, state of...