The k-means algorithm is the method of choice for clustering large-scale data sets and it performs exceedingly well in practice. Most of the theoretical work is restricted to the c...
In cluster-based storage systems, the metadata server cluster must be able to adaptively distribute responsibility for metadata to maintain high system performance and long-term l...
While null space based linear discriminant analysis (NLDA) obtains a good discriminant performance, the ability easily suffers from an implicit assumption of Gaussian model with sa...
Many important industrial applications rely on data mining methods to uncover patterns and trends in large data warehouse environments. Since a data warehouse is typically updated...
Recent years have witnessed an explosion in the availability of news articles on the World Wide Web. Although searchengines’ algorithms have made it easier to locate these docum...