Our work is motivated by the problem of managing data on storage devices, typically a set of disks. Such storage servers are used as web servers or multimedia servers, for handling...
Leana Golubchik, Samir Khuller, Yoo Ah Kim, Svetla...
Parallel dataflow programs generate enormous amounts of distributed data that are short-lived, yet are critical for completion of the job and for good run-time performance. We ca...
Steven Y. Ko, Imranul Hoque, Brian Cho, Indranil G...
The results of knowledge discovery in databases could vary depending on the data mining method. There are several ways to select the most appropriate data mining method dynamicall...
Seppo Puuronen, Vagan Y. Terziyan, Alexander Logvi...
Techniques for learning from data typically require data to be in standard form. Measurements must be encoded in a numerical format such as binary true-or-false features, numerica...
V. Seshadri, Raguram Sasisekharan, Sholom M. Weiss
We introduce a new approach for Clustering and Aggregating Relational Data (CARD). We assume that data is available in a relational form, where we only have information about the ...