We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By "categorical data," we mean tables with fiel...
David Gibson, Jon M. Kleinberg, Prabhakar Raghavan
The problem of ethnicity identification from names has a variety of important applications, including biomedical research, demographic studies, and marketing. Here we report on th...
Anurag Ambekar, Charles B. Ward, Jahangir Mohammed...
One of the most well-studied problems in data mining is computing association rules from large transactional databases. Often, the rule collections extracted from existing datamin...
Interactive Visualization has been used to study scientific phenomena, analyze data, visualize information, and to explore large amounts of multivariate data. It enables the human...
Joerg Meyer, Jim Thomas, Stephan Diehl, Brian Fish...
This paper presents an interdisciplinary investigation of statistical information retrieval (IR) techniques for protein identification from tandem mass spectra, a challenging probl...