We continue the study of approximating the number of distinct elements in a data stream of length n to within a (1? ) factor. It is known that if the stream may consist of arbitra...
Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented cooperates...
In this paper, we present a Gaussian mixture model based approach to capture the spatial characteristics of any target signal in a sensor network, and further propose a temporally...
Dimension attributes in data warehouses are typically hierarchical (e.g., geographic locations in sales data, URLs in Web traffic logs). OLAP tools are used to summarize the measu...
Various data mining applications involve data objects of multiple types that are related to each other, which can be naturally formulated as a k-partite graph. However, the resear...
Bo Long, Xiaoyun Wu, Zhongfei (Mark) Zhang, Philip...