In this paper, we present a framework for categorical data analysis which allows such data sets to be explored using a rich set of techniques that are only applicable to continuou...
Web communities involve networks of loosely coupled data sources. Members of those communities should be able to pose queries and gather results from all data sources in the networ...
Semantically heterogeneous and distributed data sources are quite common in several application domains such as bioinformatics and security informatics. In such a setting, each dat...
Google Fusion Tables is a cloud-based service for data management and integration. Fusion Tables enables users to upload tabular data files (spreadsheets, CSV, KML), currently of...
Hector Gonzalez, Alon Y. Halevy, Christian S. Jens...
Existing data-stream clustering algorithms such as CluStream are based on k-means. These clustering algorithms are unable to find clusters of arbitrary shape and cannot hand...