The cluster assumption is exploited by most semi-supervised learning (SSL) methods. However, if the unlabeled data is merely weakly related to the target classes, it becomes quest...
To take the first step beyond keyword-based search toward entity-based search, suitable token spans ("spots") on documents must be identified as references to real-world...
Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, ...
In this work, we address the problem of joint modeling of text and citations in the topic modeling framework. We present two different models called the Pairwise-Link-LDA and the ...
Ramesh Nallapati, Amr Ahmed, Eric P. Xing, William...
Versioned textual collections are collections that retain multiple versions of a document as it evolves over time. Important large-scale examples are Wikipedia and the web collect...