We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process that generates the corpus and utilize a novel,...
The k-means algorithm is a popular clustering method used in many different fields of computer science, such as data mining, machine learning and information retrieval. However, ...
As multilingual products and technology grow in importance, the Linguistic Data Consortium (LDC) intends to provide the resources needed for research and development activities, e...
Background: OmniLog™ phenotype microarrays (PMs) can measure and compare the growth responses of biological samples upon exposure to hundreds of growth condit...
Wenling E. Chang, Keri Sarver, Brandon W. Higgs, T...
Surveillance systems that operate continuously generate large volumes of data. One such system is described here, continuously tracking and storing observations taken from multiple...