High-throughput glycoproteomics, similar to genomics and proteomics, involves extremely large volumes of distributed, heterogeneous data as a basis for identification and quantifi...
Satya Sanket Sahoo, Christopher Thomas, Amit P. Sh...
Information in today’s enterprises commonly resides in a variety of heterogeneous data sources, including relational databases, web services, files, packaged applications, and c...
We introduce perturbation kernels, a new class of similarity measure for information retrieval that casts word similarity in terms of multi-task learning. Perturbation kernels mode...
In this paper, we propose an extension of the PLSA model in which an extra latent variable allows the model to co-cluster documents and terms simultaneously. We show on three datase...
Researchers at the National Institute of Standards and Technology are developing a virtual manufacturing cell. This cell will contain simulation models of a wide range of manufact...