This paper proposes a statistical model for defining string similarity. The proposed model is based on a hidden Markov model and defines string similarity as the combination of simi...
We describe an infrastructure for the collection and management of large amounts of text, and discuss the possibility of information extraction and visualisation from text corpora...
Background: The inference of homology from statistically significant sequence similarity is a central issue in sequence alignments. So far the statistical distribution function un...
Background: The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Mo...
Background: The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of m...
David M. Blei, K. Franks, Michael I. Jordan, I. Sa...