Many modern enterprises are collecting data at the most detailed level possible, creating data repositories ranging from terabytes to petabytes in size. The ability to apply sophi...
Sudipto Das, Yannis Sismanis, Kevin S. Beyer, Rain...
In this paper, we describe the JAM system, a distributed, scalable and portable agent-based data mining system that employs a general approach to scaling data mining applications ...
Salvatore J. Stolfo, Andreas L. Prodromidis, Shell...
Fragmentation leads to unpredictable and degraded application performance. While these problems have been studied in detail for desktop filesystem workloads, this study examines n...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
Creating, maintaining, or using a digital library requires the manipulation of digital documents. Information workspaces provide a visual representation allowing users to collect,...
Frank M. Shipman III, Hao-wei Hsieh, J. Michael Mo...