We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
Background: Many algorithms have been developed for deciphering the tandem mass spectrometry (MS) data sets. They can be essentially clustered into two classes. The first performs...
Some telecommunication systems can not afford the cost of repeating a corrupted message. Instead, the message should be somewhat “corrected” by the receiver. In these cases a...
In this paper we present the Slim-tree, a dynamic tree for organizing metric datasets in pages of fixed size. The Slim-tree uses the "fat-factor" which provides a simple ...
Caetano Traina Jr., Agma J. M. Traina, Bernhard Se...
Citation matching, or the automatic grouping of bibliographic references that refer to the same document, is a data management problem faced by automatic digital libraries for sci...
Isaac G. Councill, Huajing Li, Ziming Zhuang, Sand...