In this paper, we propose a new variant of Latent Dirichlet Allocation (LDA), Collective LDA (C-LDA), for modeling multiple corpora. C-LDA combines multiple corpora during learning...
We formulate and study a privacy guarantee for data owners who share information with clients by publishing views of a proprietary database. The owner identifies the sensitive pro...
An index for an r.e. class of languages (by definition) generates a sequence of grammars defining the class. An index for an indexed family of languages (by definition) generat...
Recently, stability-based techniques have emerged as a promising solution to the problem of cluster validation. An inherent drawback of these approaches is the computational c...
The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in...